Abstract. We consider the parametric estimation of the driving Lévy process of a multivariate continuous-time
autoregressive moving average (MCARMA) process, which is observed on the discrete time grid (0, h, 2h, ...).
Beginning with a new state space representation, we develop a method to recover the driving Lévy process
exactly from a continuous record of the observed MCARMA process. We use tools from numerical analysis
and the theory of infinitely divisible distributions to extend this result to allow for the approximate recovery
of unit increments of the driving Lévy process from discrete-time observations of the MCARMA process. We
show that, if the sampling interval h = h_N is chosen dependent on N, the length of the observation horizon, such
that N h_N converges to zero as N tends to infinity, then any suitable generalized method of moments estimator
based on this reconstructed sample of unit increments has the same asymptotic distribution as the one based on
the true increments, and is, in particular, asymptotically normally distributed.
1. Introduction

Continuous-time autoregressive moving average (CARMA) processes generalize the widely employed
discrete-time ARMA process to a continuous-time setting. Heuristically, a multivariate CARMA process
of order (p, q) can be thought of as a stationary solution Y of the linear differential equation



\[
\left[ D^p + A_1 D^{p-1} + \ldots + A_p \right] Y(t) = \left[ B_0 + B_1 D + \ldots + B_q D^q \right] DL(t), \qquad D \equiv \frac{d}{dt}, \quad p > q, \tag{1.1}
\]



where L is a Lévy process and A_i, B_j are coefficient matrices, see Section 3 for a precise definition. They
first appeared in the literature in [14], where univariate Gaussian CARMA processes were defined. Re- 
cent years have seen a rapid development in both the theory and the applications of this class of stochastic 
processes (see, e.g., [9] and references therein). In [8], the restriction of Gaussianity was relaxed and 
CARMA processes driven by Levy processes with finite moments of any order greater than zero were in- 
troduced (see also [12]). This extension allowed for CARMA processes to have jumps as well as a wide 
variety of marginal distributions, possibly exhibiting fat tails. Shortly after that, [27] defined multivari- 
ate CARMA processes and thereby made it possible to model a set of dependent time series jointly by a 
single continuous-time linear process. Further developments of the concept led to fractionally integrated
CARMA (FICARMA, [7, 26]) and superpositions of CARMA (supCARMA, [4]) processes, both allowing
for long-memory effects. In many contexts continuous-time processes are particularly suitable for stochastic
modelling because they allow for irregularly-spaced observations and high-frequency sampling. We
refer the reader to [3, 6, 42] for an overview of successful applications of CARMA processes in economics 
and mathematical finance. 

Despite the growing interest of practitioners in using CARMA processes as stochastic models for ob- 
served time series, the statistical theory for such processes has received little attention in the past. One 
of the basic questions with regard to parameter inference or model selection is how to determine which 
particular member of a class of stochastic models best describes the characteristic statistical properties of 
an observed time series. If one decides to model a phenomenon by a CARMA process as in Eq. (1.1), 
which can often be argued to be a reasonable choice of model class, this problem reduces to the three tasks 
of choosing suitable integers p, q describing the order of the process; estimating the coefficient matrices 
A_i, B_j; and suggesting an appropriate model for the driving Lévy process L.

In this paper, we address the last of these three problems and develop a method to estimate a parametric 
model for the driving Levy process of a multivariate CARMA process, building on an idea suggested in 
[11] for the special case of a univariate CARMA process of order (2, 1). The strategy is to observe that 
the distribution of a Levy process L is uniquely determined by the distribution of the unit increments 
ΔL_n = L(n) - L(n - 1); if one therefore had access to the increments (ΔL_n)_{n=1,...,N} over a sufficiently
long time-horizon, one could easily estimate a model for L by any of several well-established methods, 
including parametric as well as non-parametric approaches ([15, 16] and references therein). It is thus 
natural to try and express the increments of the driving Levy process - at least approximately - in terms of 
the observed values of the CARMA process and subject this approximate sample from the unit-increment 
distribution to the same estimation method one would use with the true sample. One difficulty arising in 
this step is that one usually does not observe a CARMA process continuously but that one instead only
has access to its values on a discrete, yet possibly very fine, time grid; in fact, as we shall see in Section 4, 
it is this assumption of discrete-time observations that prevents us from exactly recovering the increments 
of the Levy process from the recorded CARMA process. 
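The notion of unit increments can be illustrated by a small simulation. The following is a minimal sketch, not part of the method developed in this paper: it assumes a gamma Lévy process as an example of L, and the function names `simulate_gamma_levy_path` and `unit_increments` as well as all parameter values are hypothetical choices made for illustration.

```python
import random


def simulate_gamma_levy_path(horizon, steps_per_unit, shape, scale, seed=0):
    """Simulate a gamma Levy process on the grid 0, h, 2h, ..., horizon with h = 1/steps_per_unit.

    Assumption of this sketch: the increment over a step of length h is
    Gamma(shape * h, scale) distributed, so that L(1) is Gamma(shape, scale).
    """
    rng = random.Random(seed)
    h = 1.0 / steps_per_unit
    path = [0.0]
    for _ in range(horizon * steps_per_unit):
        path.append(path[-1] + rng.gammavariate(shape * h, scale))
    return path


def unit_increments(path, steps_per_unit):
    """Return the unit increments L(n) - L(n - 1), n = 1, ..., N, of a path on a grid of mesh 1/steps_per_unit."""
    n_units = (len(path) - 1) // steps_per_unit
    return [path[n * steps_per_unit] - path[(n - 1) * steps_per_unit]
            for n in range(1, n_units + 1)]


# Horizon N = 200 observed with mesh h = 0.1
path = simulate_gamma_levy_path(horizon=200, steps_per_unit=10, shape=2.0, scale=0.5)
increments = unit_increments(path, steps_per_unit=10)
```

In this toy setting the Lévy process itself is observed, so the unit increments are available exactly; the difficulty addressed in the paper is that only the CARMA process Y is observed.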

In this paper, we concentrate on the parametric generalized moment estimators (see, e.g., [18, 30]) and 
prove that the estimate based on the reconstructed increments of L has the same asymptotic distribution as 
the estimate based on the true increments, provided that both the length of the observation period and the 
sampling frequency h^{-1} at which the CARMA process is recorded, go to infinity at the right rate. In fact
we obtain the quantitative criterion that h = h_N must be chosen dependent on N such that N h_N converges to
zero as N tends to infinity. The generalized method of moments (GMM) estimators contain as special cases
the classical maximum likelihood estimators as well as non-linear least squares estimators that are based on 
fitting the empirical characteristic function of the observed sample to its theoretical counterpart. In view of 
the structure of the Levy-Khintchine formula, the latter method is particularly suited for the estimation of 
Levy processes. We impose no assumptions on the driving Levy process except for the finiteness of certain 
moments that depend on the particular moment function used in the GMM approach. In our main result,
Theorem 6.5, we prove the consistency and asymptotic normality of a wide class of GMM estimators that
satisfy a set of mild standard technical assumptions. 
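As a simple instance of a moment-based estimator, the sketch below matches the first two moments of a Gamma(shape, scale) law to a sample of unit increments. This exactly identified case is only an illustration under the assumption of a gamma-distributed L(1); the function name `gamma_moment_estimator` is hypothetical, and the estimators covered by Theorem 6.5 are more general.

```python
import random
import statistics


def gamma_moment_estimator(sample):
    """Estimate (shape, scale) of a Gamma law by matching mean and variance.

    Uses the identities mean = shape * scale and variance = shape * scale**2.
    """
    mean = statistics.fmean(sample)
    var = statistics.pvariance(sample, mu=mean)
    scale = var / mean
    shape = mean / scale
    return shape, scale


# Usage on synthetic increments with true parameters shape = 2, scale = 0.5
rng = random.Random(1)
sample = [rng.gammavariate(2.0, 0.5) for _ in range(20000)]
shape_hat, scale_hat = gamma_moment_estimator(sample)
```

With reconstructed rather than true increments, the same function would be applied to the perturbed sample; the paper's result concerns when this substitution is asymptotically harmless.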
For some recent results about the estimation of discretely observed diffusion processes with jumps we
refer the reader to [31, 32, 39] and their references. The theory developed in these papers is not applicable 
to the problem of estimating the driving Levy process of a discretely observed MCARMA process because 
MCARMA processes are not, in general, diffusions. 

It seems possible to relax the assumption of uniform sampling as long as the maximal distance between 
two recording times in the observation interval tends to zero. More important, however, is the natural
question whether there exist methods to estimate the driving Lévy process of a CARMA process that do not
require high frequency sampling but still have desirable asymptotic properties. Another interesting topic 
for further investigation is the behaviour of non-parametric estimators for the driving Levy process if they 
are used with a disturbed sample of the unit increments as described in this paper. 

1.1. Outline of the paper. The paper is structured as follows. In Section 2 we take a closer look at multivariate
Lévy processes and infinitely divisible distributions, the fundamental ingredients in the definition
of a multivariate CARMA process. First, we briefly review their definition and some important basic prop- 
erties. In Section 2.2 we obtain a new quantitative bound for the absolute moments of an infinitely divisible 
distribution in terms of its characteristic triplet, which is essential for many of the subsequent proofs. We 
also derive the exact polynomial time-dependence of the absolute moments of a Lévy process in Proposition 2.3.
As a further preparation for the proofs of our main results, Theorem 2.4 in Section 2.3 establishes
a Fubini-type result for double integrals with respect to a Levy process over an unbounded domain. 

The definition of multivariate CARMA processes as well as important properties, such as moments, 
mixing and smoothness of sample paths, are presented in Section 3. In Theorem 3.2, we prove an alternative 
state space representation for multivariate CARMA processes, called the controller canonical form, which 
lends itself more easily to the estimation of the driving Levy process than the original definition. 

In Section 4 we show that, conditional on an initial value, whose influence decays exponentially, one 
can exactly recover the value of the driving Levy process from a continuous record of the multivariate 
CARMA process. The functional dependence is explicit and given in Theorem 4.3. 

Since such a continuous record is usually not available, Section 5 is devoted to discretizing the result
found in Theorem 4.3. To this end, we analyse how pathwise derivatives and definite integrals of Lévy-driven
CARMA processes can be approximated from observations on a discrete time grid, and we determine
the asymptotic behaviour of these approximations as the mesh size tends to zero. To our knowledge,
this is the first time that numerical differentiation and integration schemes are investigated quantitatively
for this class of stochastic processes. The results of this section are summarized in Theorem 5.7.

In Section 6, we prove consistency and asymptotic normality of the generalized method of moments 
estimator when the sample is not i.i.d. but instead disturbed by a noise sequence, which corresponds to 
the discretization error from the previous section. Theorem 6.2 shows that if the sampling frequency 
goes to infinity fast enough with the length of the observation interval, such that N h_N converges to zero,
then the effect of the discretization becomes asymptotically negligible and the limiting distribution of the 
estimated parameter is identical to the one obtained from an unperturbed sample. Finally, in Theorem 6.5, 
we apply this result to give an answer to the question of how to estimate a parametric model of the driving 
Levy process of a multivariate CARMA process if high-frequency observations are available. 

Finally, we present the results of a simulation study for a Gamma-driven CARMA(3,1) process in Sec- 
tion 7. 

Appendix A contains auxiliary results and some technical proofs that complement the presentation of
our results in the main part of the paper.

1.2. Notation. Throughout the paper we use the following notation. The natural, real, complex numbers
and the integers are denoted by $\mathbb{N}$, $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{Z}$, respectively. Vectors in $\mathbb{R}^m$ are printed in bold, and we use
superscripts to denote the components of a vector, e.g., $\mathbb{R}^m \ni \boldsymbol{x} = (x^1, \ldots, x^m)$. We write $0_m$ for the zero
vector in $\mathbb{R}^m$, and we let $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ represent the Euclidean norm and inner product, respectively. The ring
of polynomial expressions in $z$ over a ring $K$ is denoted by $K[z]$. The symbols $M_{m,n}(K)$, or $M_m(K)$ if $m = n$,
stand for the space of $m \times n$ matrices with entries in $K$. The transpose of a matrix $A$ is written as $A^T$, and
$1_m$ and $0_m$ denote the identity and the zero element in $M_m(K)$, respectively. The symbol $\|\cdot\|$ is also used
for the operator norm on $M_{m,n}(\mathbb{R})$ induced by the Euclidean vector norm. For any topological space $X$, the
symbol $\mathcal{B}(X)$ denotes the Borel $\sigma$-algebra on $X$. We frequently use the following Landau notation: for two
functions $f$ and $g$ defined on the interval $[0, 1]$ we write $f(h) = O(g(h))$ if there exists a constant $C$ such
that $|f(h)| \leq C g(h)$ for all $h < 1$. We use the notation $\|\cdot\|_{L^p}$ for the norm on the classical $L^p$ spaces. The
symbol $\lambda$ stands for the Lebesgue measure, and the indicator function of a set $B$ is denoted by $I_B(\cdot)$, defined
to be one if the argument lies in $B$ and zero otherwise. We write $\xrightarrow{P}$ and $\xrightarrow{d}$ for convergence in probability
and convergence in distribution, respectively, and use the symbol $\stackrel{d}{=}$ to denote equality in distribution of two
random variables. For a positive real number $a$, we write $(a)_0$ for the smallest even integer greater than or
equal to $a$.

Throughout the paper, the symbol $h$ denotes a sampling interval or, equivalently, the inverse of the
sampling frequency at which a continuous-time process is recorded. $N$ is the length of the observation
horizon and thus also the number of unit increments of the Lévy process that can be reconstructed from
observing the MCARMA process over that period. $N$ is not the total number of observations, which is $N/h$.
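Two of the conventions above are easy to confuse in practice: the even ceiling $(a)_0$ and the distinction between the horizon $N$ and the number of observations $N/h$. The following sketch spells both out; the function names are hypothetical and chosen only for this illustration.

```python
import math


def smallest_even_at_least(a):
    """Return (a)_0, the smallest even integer greater than or equal to the positive real a."""
    if a <= 0:
        raise ValueError("a must be positive")
    return 2 * math.ceil(a / 2)


def number_of_observations(horizon_n, h):
    """Return N/h, the number of sampling steps of length h in an observation horizon of length N."""
    return round(horizon_n / h)


# A horizon of N = 100 unit increments sampled with h = 0.01 requires 10000 steps.
steps = number_of_observations(100, 0.01)
```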



2. Lévy processes and infinitely divisible distributions

2.1. Definition and Lévy-Itô decomposition. Lévy processes are the main ingredient in the definition of
a multivariate CARMA process and an important object of study in this paper. In this section we review
their definition and some elementary properties. A detailed account can be found in [2, 36].

Definition 2.1. A (one-sided) $\mathbb{R}^m$-valued Lévy process $(L(t))_{t \geq 0}$ is a stochastic process, defined on the
probability space $(\Omega, \mathcal{F}, P)$, with stationary, independent increments, continuous in probability and satisfying
$L(0) = 0_m$ almost surely.

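Every $\mathbb{R}^m$-valued Lévy process $(L(t))_{t \geq 0}$ can without loss of generality be assumed to be càdlàg, which
means that the sample paths are right-continuous and have left limits; it is completely characterized by its
characteristic function in the Lévy-Khintchine form $E e^{\mathrm{i}\langle u, L(t)\rangle} = \exp\{t\psi^L(u)\}$, $u \in \mathbb{R}^m$, $t \geq 0$, where $\psi^L$ has
the special form

\[
\psi^L(u) = \mathrm{i}\langle \gamma^L, u\rangle - \frac{1}{2}\langle u, \Sigma^{\mathcal{G}} u\rangle + \int_{\mathbb{R}^m} \left[ e^{\mathrm{i}\langle u, x\rangle} - 1 - \mathrm{i}\langle u, x\rangle I_{\{\|x\| \leq 1\}} \right] \nu^L(dx). \tag{2.1}
\]

The vector $\gamma^L \in \mathbb{R}^m$ is called the drift, the non-negative definite, symmetric $m \times m$ matrix $\Sigma^{\mathcal{G}}$ is the Gaussian
covariance matrix and $\nu^L$ is a measure on $\mathbb{R}^m$, referred to as the Lévy measure, satisfying

\[
\nu^L(\{0_m\}) = 0, \qquad \int_{\mathbb{R}^m} \min(\|x\|^2, 1)\, \nu^L(dx) < \infty.
\]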
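To make the Lévy-Khintchine form concrete, the sketch below evaluates the characteristic exponent in dimension one for a Lévy measure consisting of finitely many atoms. It assumes that the compensating term in Eq. (2.1) is switched on exactly for jumps of size at most one; the function names and the atomic form of the Lévy measure are hypothetical simplifications for illustration.

```python
import cmath


def levy_exponent(u, drift, gauss_var, atoms):
    """Evaluate the exponent psi(u) of Eq. (2.1) in dimension one.

    atoms is a list of (jump_size, weight) pairs describing the Levy measure
    sum_j weight_j * delta_{jump_size_j}; jump sizes must be nonzero.
    """
    psi = 1j * drift * u - 0.5 * gauss_var * u * u
    for x, weight in atoms:
        # the compensator i*u*x is only present for small jumps |x| <= 1
        compensator = 1j * u * x if abs(x) <= 1.0 else 0.0
        psi += weight * (cmath.exp(1j * u * x) - 1.0 - compensator)
    return psi


def characteristic_function(u, t, drift, gauss_var, atoms):
    """Return E exp(i u L(t)) = exp(t * psi(u))."""
    return cmath.exp(t * levy_exponent(u, drift, gauss_var, atoms))


# Standard Brownian motion: psi(u) = -u**2 / 2
psi_bm = levy_exponent(2.0, 0.0, 1.0, [])
# Poisson-type process with jumps of size 2 and intensity 3 (no compensation needed)
phi_jump = characteristic_function(0.7, 1.0, 0.0, 0.0, [(2.0, 3.0)])
```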

Put differently, for every $t \geq 0$, the distribution of $L(t)$ is infinitely divisible with characteristic triplet
$(t\gamma^L, t\Sigma^{\mathcal{G}}, t\nu^L)$. By the Lévy-Itô decomposition the paths of $L$ can be decomposed almost surely into a
Brownian motion with drift, a compound Poisson process and a purely discontinuous $L^2$-martingale according to

\[
L(t) = \gamma^L t + (\Sigma^{\mathcal{G}})^{1/2} W_t + \int_{\|x\| > 1} \int_0^t x\, N(ds, dx) + \lim_{\varepsilon \searrow 0} \int_{\varepsilon \leq \|x\| \leq 1} \int_0^t x\, \tilde{N}(ds, dx),
\]

where $W$ is a standard $m$-dimensional Wiener process and $(\Sigma^{\mathcal{G}})^{1/2}$ is the unique positive semidefinite matrix
square root of $\Sigma^{\mathcal{G}}$. The measure $N$ is a Poisson random measure on $\mathbb{R} \times \mathbb{R}^m \setminus \{0_m\}$, independent of $W$
with intensity measure $\lambda \otimes \nu^L$ describing the jumps of $L$. More precisely, for any measurable set
$B \in \mathcal{B}(\mathbb{R} \times \mathbb{R}^m \setminus \{0_m\})$,

\[
N(B) = \#\{ s \geq 0 : (s, L(s) - L(s-)) \in B \}, \qquad L(s-) := \lim_{t \nearrow s} L(t).
\]

Finally, $\tilde{N}$ is the compensated jump measure defined by $\tilde{N}(ds, dx) = N(ds, dx) - ds\, \nu^L(dx)$. We will work
with two-sided Lévy processes $L = (L(t))_{t \in \mathbb{R}}$. These are obtained from two independent copies $(L_1(t))_{t \geq 0}$,
$(L_2(t))_{t \geq 0}$ of a one-sided Lévy process via the construction

\[
L(t) = \begin{cases} L_1(t), & t \geq 0, \\ -L_2(-t-), & t < 0. \end{cases}
\]


In the following we present some elementary facts about stochastic integrals with respect to Lévy processes,
which we will use later. Comprehensive accounts of this wide field are given in the textbooks [2,
33]. Let $f : \mathbb{R} \to M_{d,m}(\mathbb{R})$ be a Lebesgue measurable, square-integrable function. Under the condition that
$L(1)$ has finite second moments, the stochastic integral

\[
I = \int_{\mathbb{R}} f(s)\, dL(s)
\]

exists in $L^2(\Omega, P)$. Moreover, the distribution of the random variable $I$ is infinitely divisible with characteristic
triplet $(\gamma_f, \Sigma_f, \nu_f)$ which can be expressed explicitly in terms of the characteristic triplet of $L$ via the
formulas ([34, Theorem 2.7])

\[
\gamma_f = \int_{\mathbb{R}} f(s)\gamma^L\, ds + \int_{\mathbb{R}} \int_{\mathbb{R}^m} f(s)x \left[ I_{[0,1]}(\|f(s)x\|) - I_{[0,1]}(\|x\|) \right] \nu^L(dx)\, ds, \tag{2.2a}
\]
\[
\Sigma_f = \int_{\mathbb{R}} f(s)\, \Sigma^{\mathcal{G}} f(s)^T\, ds, \tag{2.2b}
\]
\[
\nu_f(B) = \int_{\mathbb{R}} \int_{\mathbb{R}^m} I_B(f(s)x)\, \nu^L(dx)\, ds, \qquad B \in \mathcal{B}(\mathbb{R}^d \setminus \{0_d\}). \tag{2.2c}
\]

2.2. Bounds for the absolute moments of infinitely divisible distributions and Lévy processes. In
this short section we derive some bounds for the absolute moments of multivariate infinitely divisible
distributions and Lévy processes which will turn out to be essential for the proofs of our main results later.
It is well known that the $k$th absolute moment of an infinitely divisible random variable $X$ with characteristic
triplet $(\gamma, \Sigma, \nu)$ is finite if and only if the measure $\nu$, restricted to $\{\|x\| > 1\}$, has a finite $k$th absolute moment.
We need the following stronger result, which establishes a quantitative bound for the absolute moments of
an infinitely divisible distribution in terms of its characteristic triplet.

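Lemma 2.2. Let $X$ be an infinitely divisible, $\mathbb{R}^m$-valued random variable with characteristic triplet $(\gamma, \Sigma, \nu)$
and let $k$ be a positive even integer. Assume that the constants $c_i$, $C_i$, $i = 0, 1$, satisfy

\[
\int_{\|x\| \leq 1} \|x\|^r\, \nu(dx) \leq C_0 c_0^r, \qquad r = 2, \ldots, k, \tag{2.3a}
\]
\[
\int_{\|x\| > 1} \|x\|^r\, \nu(dx) \leq C_1 c_1^r, \qquad r = 1, \ldots, k. \tag{2.3b}
\]

Then there exists a constant $C > 0$, depending on $m$ and $k$, but not on $(\gamma, \Sigma, \nu)$, such that

\[
E\|X\|^k \leq C \left[ \|\gamma\|^k + \|\Sigma\|^{k/2} + c_0^k + c_1^k \right]. \tag{2.4}
\]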

Proof. Denote by $\nu_0 = \nu|_{\{\|x\| \leq 1\}}$ and $\nu_1 = \nu|_{\{\|x\| > 1\}}$ the restrictions of the measure $\nu$ to the unit ball of $\mathbb{R}^m$ and
its complement, respectively. It follows from the Lévy-Khintchine formula (2.1) that we can construct a
standard normal random variable $W$ and two infinitely divisible random variables $X_0, X_1$, with characteristic
triplets $(0_m, 0_m, \nu_0)$, $(0_m, 0_m, \nu_1)$, and distributions $\mu_0$, $\mu_1$, respectively, such that $X = \gamma + \Sigma^{1/2} W + X_0 + X_1$.
Using the notation $n!!$ for the double factorial of the natural number $n$ as well as [5, Eq. (4.20)] for the
absolute moments of a standard normal random variable, the $k$th absolute moment of the Gaussian part is
readily estimated as

\[
E\|\Sigma^{1/2} W\|^k \leq \|\Sigma\|^{k/2} m^{k/2} \sum_{i=1}^m E|W^i|^k \leq (k-1)!!\, \|\Sigma\|^{k/2} m^{k/2+1},
\]

which implies that

\[
E\|X\|^k \leq 4^k \left[ \|\gamma\|^k + m^{k/2+1} (k-1)!!\, \|\Sigma\|^{k/2} + E\|X_0\|^k + E\|X_1\|^k \right]. \tag{2.5}
\]

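The first two terms in this sum are already of the form asserted in Eq. (2.4). We next consider the fourth
term. By construction, the characteristic function of $X_1$ is given by

\[
\hat{\mu}_1(u) := E e^{\mathrm{i}\langle u, X_1\rangle} = \exp\left\{ \int_{\|x\| > 1} \left[ e^{\mathrm{i}\langle u, x\rangle} - 1 \right] \nu(dx) \right\}, \qquad u \in \mathbb{R}^m.
\]

By assumption (2.3b) and [36, Corollary 25.8], $\int \|x\|^k \mu_1(dx) < \infty$ and [36, Proposition 2.5(ix)] shows that
the mixed moments of $X_1$ of order $k$ are given by

\[
\int_{\mathbb{R}^m} x^{i_1} \cdots x^{i_k}\, \mu_1(dx) = \frac{1}{\mathrm{i}^k} \left. \frac{\partial^k}{\partial u^{i_1} \cdots \partial u^{i_k}} \hat{\mu}_1(u) \right|_{u = 0_m}, \qquad i_j = 1, \ldots, m.
\]

It is easy to see by induction that

\[
\frac{\partial^k}{\partial u^{i_1} \cdots \partial u^{i_k}} \hat{\mu}_1(u) = \mathrm{i}^k\, \hat{\mu}_1(u) \sum_{\pi \in \mathcal{P}_k} \prod_{B \in \pi} \int_{\|x\| > 1} \prod_{j \in B} x^{i_j}\, e^{\mathrm{i}\langle u, x\rangle}\, \nu(dx),
\]

where $\mathcal{P}_k$ denotes the set of partitions of $\{1, 2, \ldots, k\}$, a partition being a subset of the power set of $\{1, \ldots, k\}$
with pairwise disjoint elements such that their union is equal to $\{1, \ldots, k\}$. We write $\#\pi$ for the number of
sets in a partition $\pi$ and $|B|$ for the number of elements in such a set. Setting $u = 0_m$, specializing to $i_j = i$
and making use of the assumption that $k$ is even, the last display yields the explicit formula

\[
E|X_1^i|^k = E(X_1^i)^k = \sum_{\pi \in \mathcal{P}_k} \prod_{B \in \pi} \int_{\|x\| > 1} (x^i)^{|B|}\, \nu(dx), \qquad i = 1, \ldots, m.
\]

Using the fact that $|x^i| \leq \|x\|$ for every $x \in \mathbb{R}^m$ as well as assumption (2.3b) we thus obtain that

\[
E\|X_1\|^k \leq m^{k/2} \sum_{i=1}^m E|X_1^i|^k \leq m^{k/2+1} \sum_{\pi \in \mathcal{P}_k} \prod_{B \in \pi} \int_{\|x\| > 1} \|x\|^{|B|}\, \nu(dx) \leq c_1^k\, m^{k/2+1} \sum_{\pi \in \mathcal{P}_k} C_1^{\#\pi}. \tag{2.6}
\]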



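The third term in Eq. (2.5) can be analysed similarly: the characteristic function of $X_0$ has the form

\[
\hat{\mu}_0(u) := E e^{\mathrm{i}\langle u, X_0\rangle} = \exp\left\{ \int_{\|x\| \leq 1} \left[ e^{\mathrm{i}\langle u, x\rangle} - 1 - \mathrm{i}\langle u, x\rangle \right] \nu(dx) \right\}, \qquad u \in \mathbb{R}^m.
\]

With $\nu_0$ having bounded support, all moments of $X_0$ are finite, which implies that $\hat{\mu}_0$ is infinitely often
differentiable and that the mixed moments of $X_0$ are given by partial derivatives of $\hat{\mu}_0$, as before. The
additional compensatory term $\mathrm{i}\langle u, x\rangle$ in the integral ensures that the first derivative of $\hat{\mu}_0$ vanishes at zero,
which leads to

\[
E|X_0^i|^k = E(X_0^i)^k = \sum_{\substack{\pi \in \mathcal{P}_k \\ \min\{|B| : B \in \pi\} \geq 2}} \prod_{B \in \pi} \int_{\|x\| \leq 1} (x^i)^{|B|}\, \nu(dx), \qquad i = 1, \ldots, m.
\]

Using assumption (2.3a) we can thus estimate

\[
E\|X_0\|^k \leq m^{k/2} \sum_{i=1}^m E|X_0^i|^k \leq m^{k/2+1} \sum_{\substack{\pi \in \mathcal{P}_k \\ \min\{|B| : B \in \pi\} \geq 2}} \prod_{B \in \pi} \int_{\|x\| \leq 1} \|x\|^{|B|}\, \nu(dx) \leq c_0^k\, m^{k/2+1} \sum_{\substack{\pi \in \mathcal{P}_k \\ \min\{|B| : B \in \pi\} \geq 2}} C_0^{\#\pi}. \tag{2.7}
\]

The bounds (2.5) to (2.7) show that the claim (2.4) holds with

\[
C = 4^k\, m^{k/2+1} \left[ (k-1)!! + \sum_{\pi \in \mathcal{P}_k} C_1^{\#\pi} + \sum_{\substack{\pi \in \mathcal{P}_k \\ \min\{|B| : B \in \pi\} \geq 2}} C_0^{\#\pi} \right]. \qquad \Box
\]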
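The partition sums used in the proof can be checked on a toy case. The sketch below enumerates the set partitions of $\{1, \ldots, k\}$ and evaluates sums of the form $\sum_\pi \prod_{B \in \pi} c(|B|)$, optionally restricted to partitions whose blocks all have at least two elements. The function names are hypothetical; the Poisson example (Lévy measure $\lambda\delta_1$, so that every block integral equals $\lambda$) is an assumption made only to have closed-form moments to compare against.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new singleton block
        yield [[first]] + partition


def moment_from_partitions(k, block_integral, min_block_size=1):
    """Sum over partitions pi of {1, ..., k} of the product of block_integral(|B|), B in pi.

    Only partitions whose blocks all have at least min_block_size elements are counted.
    """
    total = 0
    for partition in set_partitions(list(range(1, k + 1))):
        if all(len(block) >= min_block_size for block in partition):
            term = 1
            for block in partition:
                term *= block_integral(len(block))
            total += term
    return total


rate = 2
# Uncompensated case: fourth moment of a Poisson(2) variable
fourth_moment = moment_from_partitions(4, lambda r: rate)
# Compensated case (blocks of size >= 2 only): fourth central moment of Poisson(2)
fourth_central_moment = moment_from_partitions(4, lambda r: rate, min_block_size=2)
```

For rate 2 the two sums reproduce the known values $\lambda + 7\lambda^2 + 6\lambda^3 + \lambda^4 = 94$ and $\lambda + 3\lambda^2 = 14$.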



Since the marginal distributions of a Lévy process $L$ are infinitely divisible, the behaviour of their
moments can be analysed by the previous Lemma 2.2. We prefer, however, to give an exact description of
the time-dependence of $E\|L(t)\|^k$ for even exponents $k$ and derive from that the asymptotic behaviour as $t$
tends to zero.

Proposition 2.3. Let $k$ be a positive real number and $L$ be a Lévy process.
i) If $k$ is an even integer and $E\|L(1)\|^k$ is finite, then there exist real numbers $m_1, \ldots, m_k$ such that

\[
E\|L(t)\|^k = m_1 t + \ldots + m_k t^k, \qquad t \geq 0. \tag{2.8}
\]

ii) If $E\|L(1)\|^{(k)_0}$ is finite, then $E\|L(h)\|^k = O(h^{k/(k)_0})$ as $h \to 0$.
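Proof. For the proof of i) we introduce the notation $\kappa(L^{i_1}(t), \ldots, L^{i_k}(t))$, $1 \leq i_1, \ldots, i_k \leq m$, for the mixed
cumulants of $L(t)$ of order $k$. They are defined in terms of the characteristic function of $L$ as

\[
\kappa(L^{i_1}(t), \ldots, L^{i_k}(t)) = \frac{1}{\mathrm{i}^k} \left. \frac{\partial^k}{\partial u^{i_1} \cdots \partial u^{i_k}} \log E e^{\mathrm{i}\langle u, L(t)\rangle} \right|_{u = 0_m}
\]

and are clearly homogeneous of degree one in $t$. There is a close combinatoric relationship between moments
and cumulants, which was used implicitly in the proof of Lemma 2.2 and which explicitly reads (see
[40, §12, Theorem 6]):

\[
E\left( L^{i_1}(t) \cdots L^{i_k}(t) \right) = \sum_{\pi \in \mathcal{P}_k} \prod_{B \in \pi} \kappa\left( L^{i_j}(t) : j \in B \right) = \sum_{\pi \in \mathcal{P}_k} \prod_{B \in \pi} t\, \kappa\left( L^{i_j}(1) : j \in B \right) = \sum_{\varkappa = 1}^k m_{i_1, \ldots, i_k; \varkappa}\, t^{\varkappa},
\]

where

\[
m_{i_1, \ldots, i_k; \varkappa} = \sum_{\substack{\pi \in \mathcal{P}_k \\ \#\pi = \varkappa}} \prod_{B \in \pi} \kappa\left( L^{i_j}(1) : j \in B \right).
\]

Writing $k = 2l$, the Multinomial Theorem implies that

\[
\|L(t)\|^k = \left[ (L^1(t))^2 + \ldots + (L^m(t))^2 \right]^l = \sum_{\substack{0 \leq l_1, \ldots, l_m \leq l \\ l_1 + \ldots + l_m = l}} \binom{l}{l_1, \ldots, l_m} (L^1(t))^{2 l_1} \cdots (L^m(t))^{2 l_m},
\]

and thus it follows, by what was just shown and the linearity of expectation, that

\[
E\|L(t)\|^k = \sum_{\varkappa = 1}^k \left[ \sum_{\substack{0 \leq l_1, \ldots, l_m \leq l \\ l_1 + \ldots + l_m = l}} \binom{l}{l_1, \ldots, l_m}\, m_{\underbrace{\scriptstyle 1, \ldots, 1}_{2 l_1 \text{ times}}, \ldots, \underbrace{\scriptstyle m, \ldots, m}_{2 l_m \text{ times}}; \varkappa} \right] t^{\varkappa}.
\]

This proves Eq. (2.8). Assertion ii) follows for even $k$ directly from the polynomial time-dependence of
$E\|L(t)\|^k$ which we have just established. For general $k$ we use Hölder's inequality which implies that

\[
E\|L(t)\|^k \leq \left( E\|L(t)\|^{(k)_0} \right)^{k/(k)_0},
\]

and since $(k)_0$ is even by definition the claim follows again from part i). □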
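The polynomial time-dependence in Eq. (2.8) can be made explicit in the simplest example. For a one-dimensional Poisson process with intensity $\lambda$ (an assumption made only for this illustration) all cumulants of $L(1)$ equal $\lambda$, so the coefficient of $t^j$ is $\lambda^j$ times the number of partitions of $\{1, \ldots, k\}$ into $j$ blocks, that is, a Stirling number of the second kind. The function names below are hypothetical.

```python
def stirling_second_kind(k):
    """Return [S(k, 1), ..., S(k, k)], the Stirling numbers of the second kind."""
    row = [1]  # S(1, 1)
    for n in range(2, k + 1):
        new_row = []
        for j in range(1, n + 1):
            from_smaller = row[j - 2] if j >= 2 else 0       # S(n-1, j-1)
            from_same = j * row[j - 1] if j <= n - 1 else 0   # j * S(n-1, j)
            new_row.append(from_smaller + from_same)
        row = new_row
    return row


def poisson_moment_coefficients(k, rate):
    """Return m_1, ..., m_k with E[L(t)**k] = m_1 t + ... + m_k t**k for a Poisson process L."""
    return [s * rate ** j for j, s in enumerate(stirling_second_kind(k), start=1)]


def moment_at_time(coefficients, t):
    """Evaluate the polynomial m_1 t + ... + m_k t**k."""
    return sum(m * t ** j for j, m in enumerate(coefficients, start=1))


coeffs = poisson_moment_coefficients(4, 2)
```

Since the polynomial has no constant term, the $k$th moment is of order $t$ as $t$ tends to zero, in line with part ii) for even $k$.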

2.3. A Fubini-type theorem for stochastic integrals with respect to Levy processes. The next result 
is a Fubini-type theorem for a special class of stochastic integrals with respect to Levy processes over an 
unbounded domain. 

Theorem 2.4. Let $[a, b] \subset \mathbb{R}$ be a bounded interval and $L$ be a Lévy process with finite second moments.
Assume that $F : [a, b] \times \mathbb{R} \to M_{d,m}(\mathbb{R})$ is a bounded function, and that the family $\{u \mapsto F(s, u)\}_{s \in [a, b]}$ is
uniformly absolutely integrable and uniformly converges to zero as $|u| \to \infty$. It then holds that

\[
\int_a^b \int_{\mathbb{R}} F(s, u)\, dL(u)\, ds = \int_{\mathbb{R}} \int_a^b F(s, u)\, ds\, dL(u), \tag{2.9}
\]

almost surely.



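Proof. We first note that since $L$ has finite second moments and $F$ is square-integrable, both integrals in
Eq. (2.9) are well-defined as $L^2$-limits of approximating Riemann-Stieltjes sums. We start the proof by
introducing the notations

\[
I = \int_a^b \int_{\mathbb{R}} F(s, u)\, dL(u)\, ds, \qquad I_N = \int_a^b \int_{-N}^{N} F(s, u)\, dL(u)\, ds,
\]
\[
\overleftrightarrow{I} = \int_{\mathbb{R}} \int_a^b F(s, u)\, ds\, dL(u), \qquad \overleftrightarrow{I}_N = \int_{-N}^{N} \int_a^b F(s, u)\, ds\, dL(u).
\]

It follows from [22, Theorem 1] (see also [33, Theorem 64]) that, for each $N$, $I_N = \overleftrightarrow{I}_N$ almost surely. We
also write

\[
\Delta_N = I - I_N, \qquad \overleftrightarrow{\Delta}_N = \overleftrightarrow{I} - \overleftrightarrow{I}_N, \qquad N > 0.
\]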
The strategy of the proof is to show that both $\Delta_N$ and $\overleftrightarrow{\Delta}_N$ converge to zero as $N$ tends to infinity, and then
to use the uniqueness of limits to conclude that $I$ must equal $\overleftrightarrow{I}$. We first investigate $E\|\Delta_N\|^2$. Clearly,

\[
\Delta_N = \int_a^b \int_{|u| > N} F(s, u)\, dL(u)\, ds. \tag{2.10}
\]



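Consequently, in order to analyse the absolute moments of $\Delta_N$ it suffices to consider the absolute moments
of the infinitely divisible random variables $\int_{|u| > N} F(s, u)\, dL(u)$, $s \in [a, b]$. By Eqs. (2.2), their characteristic
triplets $(\gamma_s^N, \Sigma_s^N, \nu_s^N)$ satisfy

\begin{align*}
\|\gamma_s^N\| \leq{} & \int_{|u| > N} \|F(s, u)\|\, du\, \|\gamma^L\| + \int_{|u| > N} \|F(s, u)\| \int_{\|x\| \leq 1} \|x\|\, I_{(1, \infty)}(\|F(s, u)x\|)\, \nu^L(dx)\, du \\
& + \int_{|u| > N} \|F(s, u)\| \int_{\|x\| > 1} \|x\|\, I_{[0, 1]}(\|F(s, u)x\|)\, \nu^L(dx)\, du \\
\leq{} & \int_{|u| > N} \|F(s, u)\|\, du \left[ \|\gamma^L\| + \int_{\|x\| > 1} \|x\|\, \nu^L(dx) \right] \tag{2.11}
\end{align*}

for all $N$ exceeding some $N_0$ which satisfies $\|F(s, u)\| \leq 1$ for all $|u| > N_0$, $s \in [a, b]$; such an $N_0$ exists by
assumption. Similarly, one obtains that

\[
\|\Sigma_s^N\| \leq \int_{|u| > N} \|F(s, u)\|^2\, du\, \|\Sigma^{\mathcal{G}}\| \leq \|\Sigma^{\mathcal{G}}\| \int_{|u| > N} \|F(s, u)\|\, du, \qquad \forall N > N_0, \tag{2.12}
\]

and

\begin{align*}
\int_{\|x\| \leq 1} \|x\|^2\, \nu_s^N(dx) & = \int_{|u| > N} \int_{\mathbb{R}^m} I_{[0, 1]}(\|F(s, u)x\|)\, \|F(s, u)x\|^2\, \nu^L(dx)\, du \\
& \leq \int_{|u| > N} \|F(s, u)\|\, du \int_{\mathbb{R}^m} \|x\|^2\, \nu^L(dx), \qquad \forall N > N_0, \\
\int_{\|x\| > 1} \|x\|^r\, \nu_s^N(dx) & = \int_{|u| > N} \int_{\mathbb{R}^m} I_{(1, \infty)}(\|F(s, u)x\|)\, \|F(s, u)x\|^r\, \nu^L(dx)\, du \\
& \leq \int_{|u| > N} \int_{\mathbb{R}^m} I_{(1, \infty)}(\|F(s, u)\|\, \|x\|)\, \|F(s, u)\|^r\, \|x\|^r\, \nu^L(dx)\, du \\
& \leq \int_{|u| > N} \|F(s, u)\|\, du \int_{\|x\| > \max\{1, \|F\|_{L^\infty}^{-1}\}} \|x\|^r\, \nu^L(dx), \qquad r = 1, 2.
\end{align*}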

Applying Lemma 2.2 with $k = 2$ and using the assumed uniform absolute integrability of the family
$\{u \mapsto F(s, u)\}_{s \in [a, b]}$ we can deduce that

\[
\sup_{s \in [a, b]} E \left\| \int_{|u| > N} F(s, u)\, dL(u) \right\|^2 \to 0, \qquad \text{as } N \to \infty.
\]



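Together with Eq. (2.10) and Jensen's inequality this implies that

\[
E\|\Delta_N\|^2 \leq E\left( \int_a^b \left\| \int_{|u| > N} F(s, u)\, dL(u) \right\| ds \right)^2 \leq (b - a) \int_a^b E\left\| \int_{|u| > N} F(s, u)\, dL(u) \right\|^2 ds \leq (b - a)^2 \sup_{s \in [a, b]} E\left\| \int_{|u| > N} F(s, u)\, dL(u) \right\|^2 \to 0 \tag{2.13}
\]

as $N \to \infty$, showing that $\Delta_N$ converges to zero in $L^2$. In order to prove the same convergence also for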



\[
\overleftrightarrow{\Delta}_N = \overleftrightarrow{I} - \overleftrightarrow{I}_N = \int_{|u| > N} \int_a^b F(s, u)\, ds\, dL(u),
\]

we first define the function $\tilde{F} : \mathbb{R} \to M_{d,m}(\mathbb{R})$ by $\tilde{F}(u) = \int_a^b F(s, u)\, ds$. Since for all $u \in \mathbb{R}$, $\|\tilde{F}(u)\|$ is smaller
than $(b - a)\, \|F\|_{L^\infty([a, b] \times \mathbb{R})}$, the function $\tilde{F}$ is bounded. It is also integrable because the normal variant of
Fubini's theorem and the assumed uniform integrability of $\{F(s, \cdot)\}_{s \in [a, b]}$ imply that

\[
\int_{|u| > N} \|\tilde{F}(u)\|\, du \leq \int_a^b \int_{|u| > N} \|F(s, u)\|\, du\, ds \leq (b - a) \sup_{s \in [a, b]} \int_{|u| > N} \|F(s, u)\|\, du \to 0, \qquad N \to \infty.
\]

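Similar arguments to the ones given above then show that $\overleftrightarrow{\Delta}_N$ converges to zero in $L^2$ as well. It thus
follows by the triangle inequality that, for every $N$ and every $\epsilon$,

\begin{align*}
P\left( \|I - \overleftrightarrow{I}\| > \epsilon \right) & \leq P\left( \|I - I_N\| + \|I_N - \overleftrightarrow{I}_N\| + \|\overleftrightarrow{I}_N - \overleftrightarrow{I}\| > \epsilon \right) \\
& \leq P\left( \|\Delta_N\| > \epsilon/2 \right) + P\left( \|\overleftrightarrow{\Delta}_N\| > \epsilon/2 \right),
\end{align*}

where we have used the subadditivity of $P$ as well as the fact that $I_N$ is equal to $\overleftrightarrow{I}_N$ almost surely. Since
$L^2$-convergence implies convergence in probability ([19, Theorems 17.2]), it follows that the right hand
side of the last display is less than any positive $\delta$ if only $N$ is large enough and thus that the probability of
the absolute difference between $I$ and $\overleftrightarrow{I}$ exceeding $\epsilon$ is equal to zero for every positive $\epsilon$. This means that
$I$ equals $\overleftrightarrow{I}$ almost surely and completes the proof. □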

3. Controller canonical parametrization of multivariate CARMA processes 

Multivariate, continuous-time autoregressive moving average (abbreviated MCARMA) processes are
the continuous-time analogue of the well known vector ARMA processes. They also generalize the much-studied
univariate CARMA processes to a multidimensional setting. A $d$-dimensional MCARMA process
$Y$, specified by an autoregressive polynomial

\[
\tilde{P}(z) = 1_d z^{\tilde{p}} + \tilde{A}_1 z^{\tilde{p}-1} + \ldots + \tilde{A}_{\tilde{p}} \in M_d(\mathbb{R}[z]), \tag{3.1}
\]

a moving average polynomial

\[
\tilde{Q}(z) = \tilde{B}_0 + \tilde{B}_1 z + \ldots + \tilde{B}_{\tilde{q}} z^{\tilde{q}} \in M_{d,m}(\mathbb{R}[z]), \tag{3.2}
\]

and driven by an $m$-dimensional Lévy process $L$ is defined as a solution of the formal differential equation

\[
\tilde{P}(D) Y(t) = \tilde{Q}(D) DL(t), \qquad D \equiv \frac{d}{dt}, \quad t \in \mathbb{R}, \tag{3.3}
\]

the continuous-time version of the well-known ARMA equations. Equation (3.3) is only formal because,
in general, the paths of a Lévy process are not differentiable. It has been shown in [27] that an MCARMA
process $Y$ can equivalently be defined by the continuous-time state space model

\[
dZ(t) = \tilde{A} Z(t)\, dt + \tilde{\beta}\, dL(t), \qquad Y(t) = \tilde{C} Z(t), \qquad t \in \mathbb{R}, \tag{3.4}
\]

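where the matrices $\tilde{A}$, $\tilde{\beta}$ and $\tilde{C}$ are given by

\[
\tilde{A} = \begin{pmatrix} 0 & 1_d & 0 & \cdots & 0 \\ 0 & 0 & 1_d & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & 0 & 1_d \\ -\tilde{A}_{\tilde{p}} & -\tilde{A}_{\tilde{p}-1} & \cdots & \cdots & -\tilde{A}_1 \end{pmatrix} \in M_{\tilde{p}d}(\mathbb{R}), \tag{3.5a}
\]
\[
\tilde{\beta} = \begin{pmatrix} \tilde{\beta}_1^T & \cdots & \tilde{\beta}_{\tilde{p}}^T \end{pmatrix}^T \in M_{\tilde{p}d, m}(\mathbb{R}), \qquad \tilde{\beta}_{\tilde{p}-j} = -I_{\{0, \ldots, \tilde{q}\}}(j) \left[ \,\cdots\, \right], \tag{3.5b}
\]

with the bracket in (3.5b) containing the recursion given in [27], and

\[
\tilde{C} = \begin{pmatrix} 1_d, 0_d, \ldots, 0_d \end{pmatrix} \in M_{d, \tilde{p}d}(\mathbb{R}). \tag{3.5c}
\]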

This is but one of several possible parametrizations of the general continuous-time state space model and
is in the discrete-time literature often referred to as the observer canonical form ([21]). For the purpose
of estimating the driving Lévy process L it is more convenient to work with a different parametrization,
which, in analogy to a canonical state space representation used in discrete-time control theory, might be
called the controller canonical form. It is the multivariate generalization of the parametrization used for
univariate CARMA processes in [11]. We first state an auxiliary lemma which we could not find in the
literature.
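Lemma 3.1. Let $r, s$ be positive integers. Assume that $R(z) = 1_s z^r + M_1 z^{r-1} + \ldots + M_r \in M_s(\mathbb{R}[z])$ is a matrix
polynomial and denote by

\[
M = \begin{pmatrix} 0 & 1_s & 0 & \cdots & 0 \\ 0 & 0 & 1_s & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & 0 & 1_s \\ -M_r & -M_{r-1} & -M_{r-2} & \cdots & -M_1 \end{pmatrix} \in M_{rs}(\mathbb{R}) \tag{3.6}
\]

the associated multi-companion matrix. The rational matrix function

\[
S(z) = \left[ S_{ij}(z) \right]_{1 \leq i, j \leq r} = (z 1_{rs} - M)^{-1} \in M_{rs}(\mathbb{R}\{z\}), \qquad S_{ij}(z) \in M_s(\mathbb{R}\{z\}), \tag{3.7}
\]

is then given by the following formula for the block $S_{ij}(z)$:

\[
S_{ij}(z) = R(z)^{-1} \begin{cases} z^{r-1+i-j}\, 1_s + \sum_{k=1}^{r-j} M_k z^{r-1-k+i-j}, & j \geq i, \\[4pt] -\sum_{k=r-j+1}^{r} M_k z^{r-1-k+i-j}, & j < i. \end{cases} \tag{3.8}
\]

Proof. We compute the $(i, j)$th block of $S(z)(z 1_{rs} - M)$. Assuming $i < j$, this block is given by

\begin{align*}
\left[ S(z)(z 1_{rs} - M) \right]_{ij} & = \sum_{k=1}^{r} S_{ik}(z)\, (z 1_{rs} - M)_{kj} \\
& = z S_{ij}(z) - S_{i, j-1}(z) + S_{ir}(z) M_{r-j+1} \\
& = R(z)^{-1} \left[ \sum_{k=1}^{r-j} M_k z^{r-k+i-j} - \sum_{k=1}^{r-j+1} M_k z^{r-k+i-j} + z^{i-1} M_{r-j+1} \right] = 0.
\end{align*}

A similar calculation shows that for $i > j$, $\left[ S(z)(z 1_{rs} - M) \right]_{ij} = 0$. For the blocks on the diagonal we obtain
for $i \geq 2$,

\begin{align*}
\left[ S(z)(z 1_{rs} - M) \right]_{ii} & = \sum_{k=1}^{r} S_{ik}(z)\, (z 1_{rs} - M)_{ki} \\
& = z S_{ii}(z) - S_{i, i-1}(z) + S_{ir}(z) M_{r-i+1} \\
& = R(z)^{-1} \left[ z^r 1_s + \sum_{k=1}^{r-i} M_k z^{r-k} + \sum_{k=r-i+2}^{r} M_k z^{r-k} + z^{i-1} M_{r-i+1} \right] = 1_s,
\end{align*}

and finally

\begin{align*}
\left[ S(z)(z 1_{rs} - M) \right]_{11} & = \sum_{k=1}^{r} S_{1k}(z)\, (z 1_{rs} - M)_{k1} = z S_{11}(z) + S_{1r}(z) M_r \\
& = R(z)^{-1} \left[ z^r 1_s + \sum_{k=1}^{r-1} M_k z^{r-k} + M_r \right] = 1_s.
\end{align*}

This shows that $S(z)$ is the inverse of $z 1_{rs} - M$ and completes the proof. □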
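The block formula of Lemma 3.1 can be verified exactly in the scalar case $s = 1$, where the multi-companion matrix reduces to an ordinary companion matrix. The sketch below uses rational arithmetic and assumes the reading of the formula with first case $j \geq i$ and second case $j < i$; the function names and the example polynomial are hypothetical choices for illustration.

```python
from fractions import Fraction


def companion_matrix(coeffs):
    """Companion matrix for s = 1, with coeffs = [M_1, ..., M_r] and last row (-M_r, ..., -M_1)."""
    r = len(coeffs)
    matrix = [[Fraction(0)] * r for _ in range(r)]
    for i in range(r - 1):
        matrix[i][i + 1] = Fraction(1)
    for j in range(r):
        matrix[r - 1][j] = -Fraction(coeffs[r - 1 - j])
    return matrix


def resolvent_entry(coeffs, z, i, j):
    """Entry S_ij(z) of (z*1 - M)^{-1} from the closed-form formula, with 1-based i, j."""
    r = len(coeffs)
    m = [None] + [Fraction(c) for c in coeffs]  # m[k] = M_k
    big_r = z ** r + sum(m[k] * z ** (r - k) for k in range(1, r + 1))
    if j >= i:
        numerator = z ** (r - 1 + i - j) + sum(
            m[k] * z ** (r - 1 - k + i - j) for k in range(1, r - j + 1))
    else:
        numerator = -sum(
            m[k] * z ** (r - 1 - k + i - j) for k in range(r - j + 1, r + 1))
    return numerator / big_r


def check_resolvent(coeffs, z):
    """Return True if the matrix built from the formula times (z*1 - M) equals the identity."""
    r = len(coeffs)
    m_matrix = companion_matrix(coeffs)
    s_matrix = [[resolvent_entry(coeffs, z, i, j) for j in range(1, r + 1)]
                for i in range(1, r + 1)]
    for i in range(r):
        for j in range(r):
            entry = sum(s_matrix[i][k] * ((z if k == j else 0) - m_matrix[k][j])
                        for k in range(r))
            if entry != (1 if i == j else 0):
                return False
    return True


# R(z) = z^3 + 6 z^2 + 11 z + 6, evaluated at z = 1/2
ok = check_resolvent([6, 11, 6], Fraction(1, 2))
```

Because all arithmetic is done with `Fraction`, the comparison with the identity matrix is exact rather than up to rounding.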

Theorem 3.2 (Controller canonical state space representation). Assume that $L$ is an $m$-dimensional Lévy
process and that $Y$ is a $d$-dimensional $L$-driven MCARMA process with autoregressive polynomial
$\tilde{P} \in M_d(\mathbb{R}[z])$ and moving average polynomial $\tilde{Q} \in M_{d,m}(\mathbb{R}[z])$. Then there exist integers $p > q \geq 0$ and matrix
polynomials

\[
z \mapsto P(z) = 1_m z^p + A_1 z^{p-1} + \ldots + A_p \in M_m(\mathbb{R}[z]), \tag{3.9a}
\]
\[
z \mapsto Q(z) = B_0 + B_1 z + \ldots + B_q z^q \in M_{d,m}(\mathbb{R}[z]) \tag{3.9b}
\]

satisfying $\tilde{P}(z)^{-1} \tilde{Q}(z) = Q(z) P(z)^{-1}$ for all $z \in \mathbb{C}$ and $\det P(z) = 0$ if and only if $\det \tilde{P}(z) = 0$. Moreover, the
process $Y$ has the state space representation

\[
dX(t) = A X(t)\, dt + E_p\, dL(t), \tag{3.10a}
\]
\[
Y(t) = B X(t), \tag{3.10b}
\]

ESTIMATION OF THE DRIVING LEVY PROCESS OF MCARMA PROCESSES 






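where

\[
A = \begin{pmatrix} 0 & 1_m & 0 & \cdots & 0 \\ 0 & 0 & 1_m & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & 0 & 1_m \\ -A_p & -A_{p-1} & \cdots & \cdots & -A_1 \end{pmatrix} \in M_{pm}(\mathbb{R}), \qquad E_p = \begin{pmatrix} 0_m \\ \vdots \\ 0_m \\ 1_m \end{pmatrix} \in M_{pm, m}(\mathbb{R}), \tag{3.11a}
\]
\[
B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{p-1} \end{pmatrix} \in M_{d, pm}(\mathbb{R}), \qquad B_j = 0_{d,m}, \quad q + 1 \leq j \leq p - 1. \tag{3.11b}
\]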



Proof. The existence of matrix polynomials $P \in M_m(\mathbb{R}[z])$ and $Q \in M_{d,m}(\mathbb{R}[z])$ with the asserted properties
has been shown in [21, Lemma 6.3-8]. In order to prove Eqs. (3.10) it suffices, by [38, Theorem 1], to prove
that the triple $(A, E_p, B)$, defined in Eqs. (3.11), is a realization of the right matrix fraction $Q P^{-1}$, that is

\[
B \left[ z 1_{pm} - A \right]^{-1} E_p = Q(z) P(z)^{-1}, \qquad \forall z \in \mathbb{C}.
\]

Using Lemma 3.1 and the fact that right multiplication by $E_p$ selects the last block-column one sees that

\[
\left[ z 1_{pm} - A \right]^{-1} E_p = \begin{pmatrix} 1 & z & \cdots & z^{p-1} \end{pmatrix}^T \otimes P(z)^{-1},
\]

where $\otimes$ denotes the Kronecker product of two matrices. By definition it holds that

\[
B \left( \begin{pmatrix} 1 & z & \cdots & z^{p-1} \end{pmatrix}^T \otimes 1_m \right) = B_0 + B_1 z + \ldots + B_q z^q = Q(z),
\]

and so the claim follows. □
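The realization identity in the proof can be checked numerically in the scalar case $m = d = 1$. The sketch below builds $z 1_p - A$ for the controller canonical form, solves for the last column of its inverse by exact Gauss-Jordan elimination, and compares $B [z 1_p - A]^{-1} E_p$ with $Q(z)/P(z)$. The function names and the CARMA(3,1) example coefficients are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination over the rationals."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def transfer_function_state_space(ar, ma, z):
    """Compute B (z*1 - A)^{-1} E_p for the scalar controller canonical form.

    ar = [A_1, ..., A_p] and ma = [B_0, ..., B_q] with q < p.
    """
    p = len(ar)
    z_minus_a = [[Fraction(0)] * p for _ in range(p)]
    for i in range(p):
        z_minus_a[i][i] = Fraction(z)
        if i < p - 1:
            z_minus_a[i][i + 1] = Fraction(-1)
    for j in range(p):
        # subtracting the last row (-A_p, ..., -A_1) of A adds the coefficients
        z_minus_a[p - 1][j] += Fraction(ar[p - 1 - j])
    e_p = [0] * (p - 1) + [1]
    x = solve_linear_system(z_minus_a, e_p)
    b_row = list(ma) + [0] * (p - len(ma))
    return sum(Fraction(b) * xi for b, xi in zip(b_row, x))


def transfer_function_polynomials(ar, ma, z):
    """Compute Q(z) / P(z) directly from the polynomial coefficients."""
    z = Fraction(z)
    p = len(ar)
    p_of_z = z ** p + sum(Fraction(a) * z ** (p - k) for k, a in enumerate(ar, start=1))
    q_of_z = sum(Fraction(b) * z ** j for j, b in enumerate(ma))
    return q_of_z / p_of_z


# P(z) = z^3 + 6 z^2 + 11 z + 6 and Q(z) = 1 + 2 z, evaluated at z = 2
lhs = transfer_function_state_space([6, 11, 6], [1, 2], 2)
rhs = transfer_function_polynomials([6, 11, 6], [1, 2], 2)
```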



In view of Theorem 3.2 one can assume without loss of generality that an MCARMA process Y is 
given by a state space representation (3.10) with coefficient matrices of the form (3.11). We make the 
following assumptions about the zeros of the polynomials P, Q in equations (3.9). The first one is a 
stability assumption guaranteeing the existence of a stationary solution of the state equation (3.10a). 

Assumption Al. The zeros of the polynomial det P{z) e R[z] have strictly negative real parts. 

The second assumption corresponds to the minimum-phase assumption in classical time series analysis. For a matrix $M \in M_{d,m}(\mathbb{R})$, any matrix $M^{\sim 1}$ satisfying $M^{\sim 1}M = I_m$ is called a left inverse of M. It is easy to check that the existence of a left inverse of M is equivalent to the conditions $m \le d$, $\operatorname{rank} M = m$, and that in this case $M^{\sim 1}$ can be computed as $M^{\sim 1} = (M^TM)^{-1}M^T$.
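The left-inverse formula can be illustrated with a short sketch, assuming NumPy; the function name `left_inverse` and the example matrix are hypothetical.

```python
import numpy as np

def left_inverse(M):
    """Return (M^T M)^{-1} M^T, which exists only if M has full column rank."""
    M = np.asarray(M, dtype=float)
    d, m = M.shape
    if m > d or np.linalg.matrix_rank(M) != m:
        raise ValueError("a left inverse exists only if m <= d and rank M = m")
    # Solve (M^T M) Z = M^T instead of forming the inverse explicitly.
    return np.linalg.solve(M.T @ M, M.T)

M = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])   # d = 3, m = 2, full column rank
M_left = left_inverse(M)
```

Here `M_left @ M` equals the 2 x 2 identity up to rounding, whereas a rank-deficient input raises a `ValueError`.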

Assumption A2. The dimension m of the driving Lévy process L is smaller than or equal to the dimension of the multivariate CARMA process Y, and both $B_q$ and $B_q^TB_0$ have full rank m. The zeros of the polynomial $\det B_q^{\sim 1}Q(z) \in \mathbb{R}[z]$ have strictly negative real parts.

It is well known that every solution of Eq. (3.10a) satisfies

\[ X(t) = e^{A(t-s)}X(s) + \int_s^t e^{A(t-u)}E_p\,dL(u), \qquad s,t \in \mathbb{R}, \quad s < t. \]

Under Assumption A1, the state equation (3.10a) has a unique strictly stationary, causal solution given by

\[ X(t) = \int_{-\infty}^t e^{A(t-u)}E_p\,dL(u), \qquad t \in \mathbb{R}, \tag{3.12} \]

and consequently, the multivariate CARMA process Y has the moving average representation

\[ Y(t) = \int_{-\infty}^{\infty} g(t-u)\,dL(u), \quad t \in \mathbb{R}; \qquad g(t) = Be^{At}E_p I_{[0,\infty)}(t). \tag{3.13} \]

We recall that we denote by $X^i(t)$ the ith component of the vector X(t) and define, for $j = 1, \ldots, p$, the jth m-block of X by the formula

\[ X^{(j)}(t) = \begin{bmatrix} X^{(j-1)m+1}(t) & \cdots & X^{jm}(t) \end{bmatrix}^T, \qquad t \in \mathbb{R}. \tag{3.14} \]
A very useful property, which the sequence of approximation errors $(\widehat{\Delta L}_n - \Delta L_n)_n$ might enjoy, is asymptotic independence; heuristically this means that $\widehat{\Delta L}_n - \Delta L_n$ and $\widehat{\Delta L}_m - \Delta L_m$ are almost independent if $|n - m| \gg 1$. One possibility of making this concept precise is to introduce the notion of strong (or $\alpha$-) mixing, which was first defined in [35]. Since then it has turned out to be a very powerful tool for establishing asymptotic results in the theory of inference for stochastic processes. For a stationary stochastic process $X = (X_t)_{t \in I}$, where I is either $\mathbb{R}$ or $\mathbb{Z}$, we first introduce the $\sigma$-algebras $\mathscr{F}_n^m = \sigma(X_j : j \in I, n \le j \le m)$, where $-\infty \le n \le m \le \infty$. For $m \in I$, the strong mixing coefficient $\alpha(m)$ is defined as

\[ \alpha(m) = \sup_{A \in \mathscr{F}_{-\infty}^0,\ B \in \mathscr{F}_m^{\infty}} |P(A \cap B) - P(A)P(B)|. \tag{3.15} \]

The process X is called strongly mixing if $\lim_{m \to \infty} \alpha(m) = 0$; if $\alpha(m) = O(\lambda^m)$ for some $0 < \lambda < 1$ it is called exponentially strongly mixing.

4. Recovery of the driving Levy process from continuous-time observations 

In this section we address the problem of recovering the driving Lévy process of a multivariate CARMA process given by a state space representation (3.10), if continuous-time observations are available. We assume that the order (p, q) as well as the coefficient matrices A and B are known. If they are not, they can first be estimated by, e.g., maximization of the Gaussian likelihood [37], although the precise statistical properties of this two-step estimator are beyond the scope of the present paper. More precisely, we show that, conditional on the value X(0) of the state vector at time zero, one can write the value of L(t), for any $t \in [0, T]$, as a function of the continuous-time record $(Y(t) : 0 \le t \le T)$. In particular one can obtain an i.i.d. sample from the distribution of the unit increments $L(n) - L(n-1)$, $1 \le n \le T$, which, when subjected to one of several well-established estimation procedures, can be used to estimate a parametric model for L. It can be argued that most of the time a continuous record of observations is not available. The results of this section will, however, serve as the starting point for the recovery of an approximate sample from the unit increment distribution based on discrete-time observations of Y, which is presented in Section 5.

The strategy is to first express the state vector X in terms of the observations Y and then to invert the state equation (3.10a) to obtain the driving Lévy process as a function of the state vector. We first define the upper q-block-truncation of X, denoted by $X_q$, by

\[ X_q(t) = \begin{bmatrix} X^{(1)}(t)^T & \cdots & X^{(q)}(t)^T \end{bmatrix}^T, \qquad t \in \mathbb{R}, \]

where the m-blocks $X^{(j)}$ have been defined in Eq. (3.14).



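Lemma 4.1. Assume that L is a Lévy process and that Y is a multivariate CARMA process given as a solution of the state space equations (3.10). If Assumption A2 holds, the truncated state vector $X_q$ satisfies the stochastic differential equation

\[ dX_q(t) = \mathcal{B}X_q(t)\,dt + E_qY(t)\,dt, \tag{4.1} \]

where

\[ \mathcal{B} = \begin{bmatrix} 0 & I_m & 0 & \cdots & 0 \\ 0 & 0 & I_m & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & 0 & I_m \\ -B_q^{\sim 1}B_0 & -B_q^{\sim 1}B_1 & \cdots & \cdots & -B_q^{\sim 1}B_{q-1} \end{bmatrix} \in M_{mq}(\mathbb{R}), \qquad E_q = \begin{bmatrix} 0_{m,d} \\ \vdots \\ 0_{m,d} \\ B_q^{\sim 1} \end{bmatrix} \in M_{mq,d}(\mathbb{R}), \tag{4.2} \]

and $B_q^{\sim 1}$ denotes the left inverse of $B_q$. Moreover, the eigenvalues of the matrix $\mathcal{B}$ have strictly negative real parts.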

Proof. Equation (4.1) follows easily from combining the first q block-rows of the state transition equation (3.10a) with the observation equation (3.10b). The assertion about the eigenvalues of the matrix $\mathcal{B}$ of Eq. (4.2) is a consequence of the well-known correspondence between the eigenvalues of a multi-companion matrix and the zeros of the associated polynomial, see, e.g., [27, Lemma 3.8]. By this correspondence, the eigenvalues of $\mathcal{B}$ are exactly the zeros of the polynomial $\det\left(I_mz^q + B_q^{\sim 1}B_{q-1}z^{q-1} + \ldots + B_q^{\sim 1}B_0\right)$, which have strictly negative real parts by Assumption A2. □
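The correspondence between the eigenvalues of the companion matrix of Eq. (4.2) and the zeros of the associated polynomial can be illustrated numerically. This is a sketch assuming NumPy; the function name `truncated_companion` and the scalar example $Q(z) = (z+1)(z+2)(z+3)$ with $m = d = 1$ are illustrative assumptions.

```python
import numpy as np

def truncated_companion(B_coeffs):
    """Companion matrix of Eq. (4.2) from B_0, ..., B_q (each d x m, B_q of full column rank, q >= 1)."""
    Bq = B_coeffs[-1]
    d, m = Bq.shape
    q = len(B_coeffs) - 1
    Bq_left = np.linalg.solve(Bq.T @ Bq, Bq.T)        # left inverse of B_q
    comp = np.zeros((q * m, q * m))
    comp[: (q - 1) * m, m:] = np.eye((q - 1) * m)     # identity blocks above the last block-row
    for j in range(q):                                 # last block-row: -B_q^{~1} B_j
        comp[(q - 1) * m :, j * m : (j + 1) * m] = -Bq_left @ B_coeffs[j]
    return comp

# Q(z) = 6 + 11 z + 6 z^2 + z^3 = (z + 1)(z + 2)(z + 3), so q = 3.
B_coeffs = [np.array([[6.0]]), np.array([[11.0]]), np.array([[6.0]]), np.array([[1.0]])]
eigenvalues = np.sort(np.real(np.linalg.eigvals(truncated_companion(B_coeffs))))
roots = np.sort(np.real(np.roots([1.0, 6.0, 11.0, 6.0])))
```

Both `eigenvalues` and `roots` are numerically equal to -3, -2, -1, so the minimum-phase condition of Assumption A2 holds in this example.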



As before we see that Eq. (4.1) is readily integrated to

\[ X_q(t) = e^{\mathcal{B}(t-s)}X_q(s) + \int_s^t e^{\mathcal{B}(t-u)}E_qY(u)\,du, \qquad s,t \in \mathbb{R}, \quad s < t. \tag{4.3} \]

The remaining blocks $X^{(i)}$, $q < i \le p$, are obtained from $X_q$ and Y by differentiation. The existence of the occurring derivatives of the state process X and the MCARMA process Y is guaranteed by Lemma A.1 in the appendix.

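Lemma 4.2. For $1 \le n \le p - q$, the block $X^{(q+n)}$ is given by

\[ X^{(q+n)}(t) = \underline{E}_q\left[\mathcal{B}^nX_q(t) + \sum_{\nu=0}^{n-1}\mathcal{B}^{n-1-\nu}E_qD^{\nu}Y(t)\right], \qquad t \in \mathbb{R}, \tag{4.4} \]

where $\underline{E}_q = \begin{bmatrix} 0_m & \cdots & 0_m & I_m \end{bmatrix} \in M_{m,mq}(\mathbb{R})$ selects the last m-block of a vector in $\mathbb{R}^{mq}$.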



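Proof. We first observe that Eqs. (3.10) and Eq. (4.1) imply that

\[ X^{(q+n)}(t) = DX^{(q+n-1)}(t), \qquad DX_q(t) = \mathcal{B}X_q(t) + E_qY(t). \]

Therefore the claim is true for $n = 1$. Assuming it is true for some $1 \le n < p - q$, and writing $\underline{E}_q$ for the matrix selecting the last m-block of a vector in $\mathbb{R}^{mq}$, it follows that

\begin{align*}
X^{(q+n+1)}(t) &= DX^{(q+n)}(t) \\
&= \underline{E}_q D\left[\mathcal{B}^nX_q(t) + \sum_{\nu=0}^{n-1}\mathcal{B}^{n-1-\nu}E_qD^{\nu}Y(t)\right] \\
&= \underline{E}_q\left[\mathcal{B}^{n+1}X_q(t) + \mathcal{B}^nE_qY(t) + \sum_{\nu=0}^{n-1}\mathcal{B}^{n-1-\nu}E_qD^{\nu+1}Y(t)\right] \\
&= \underline{E}_q\left[\mathcal{B}^{n+1}X_q(t) + \sum_{\nu=0}^{n}\mathcal{B}^{n-\nu}E_qD^{\nu}Y(t)\right]. \qquad \Box
\end{align*}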

Equations (4.3) and (4.4) allow one to compute the value of X(t) based on the knowledge of the initial value X(0) and $(Y(s) : 0 \le s \le t)$. In order to obtain the value of L(t) we integrate the last block-row of the state transition equation (3.10a) to obtain

\[ L(t) = X^{(p)}(t) - X^{(p)}(0) + \underline{A}\int_0^t X(s)\,ds, \tag{4.5} \]

where $\underline{A} = \begin{bmatrix} A_p & \cdots & A_1 \end{bmatrix}$. We also write $\underline{A}_q = \begin{bmatrix} A_p & \cdots & A_{p-q+1} \end{bmatrix}$.
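The inversion of the state equation in Eq. (4.5) can be illustrated in the simplest scalar situation $dX(t) = -aX(t)\,dt + dL(t)$, where it reads $L(t) = X(t) - X(0) + a\int_0^t X(s)\,ds$. The sketch below assumes NumPy and replaces the continuous-time dynamics by an Euler scheme, for which the discrete analogue of Eq. (4.5) holds exactly; the noise model, the parameter values and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
a, h, n_steps = 0.8, 0.01, 1000

# Crude stand-in for Levy increments over intervals of length h:
# a Gaussian part plus occasional jumps (illustrative only).
jumps = rng.poisson(2.0 * h, n_steps) * rng.normal(0.0, 1.0, n_steps)
dL = np.sqrt(h) * rng.normal(size=n_steps) + jumps

# Euler scheme for dX = -a X dt + dL.
X = np.zeros(n_steps + 1)
for k in range(n_steps):
    X[k + 1] = X[k] - a * X[k] * h + dL[k]

# Discrete analogue of Eq. (4.5): L(t_k) = X(t_k) - X(0) + a * (left Riemann sum of X).
integral = h * np.concatenate(([0.0], np.cumsum(X[:-1])))
L_recovered = X - X[0] + a * integral
L_true = np.concatenate(([0.0], np.cumsum(dL)))
```

In this discretized setting `L_recovered` coincides with `L_true` up to rounding error; with genuinely continuous-time data the integral has to be approximated, which is the subject of Section 5.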

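Theorem 4.3. Let Y be the multivariate CARMA process defined by the state space representation (3.10) and assume that Assumption A2 holds. Denote by $\underline{E}_q = \begin{bmatrix} 0_m & \cdots & 0_m & I_m \end{bmatrix} \in M_{m,mq}(\mathbb{R})$ the matrix selecting the last m-block of a vector in $\mathbb{R}^{mq}$. The increment $\Delta L_n = L(n) - L(n-1)$ is then given by

\begin{align*}
\Delta L_n ={}& \sum_{\nu=0}^{p-q-1}\left[\underline{E}_q\mathcal{B}^{p-q-1-\nu}E_q + \sum_{k=\nu}^{p-q-2}A_{p-q-k-1}\underline{E}_q\mathcal{B}^{k-\nu}E_q\right]\left[D^{\nu}Y(n) - D^{\nu}Y(n-1)\right] \\
&+ \left[\underline{A}_q\mathcal{B}^{-1} + \sum_{k=1}^{p-q}A_{p-q-k+1}\underline{E}_q\mathcal{B}^{k-1} + \underline{E}_q\mathcal{B}^{p-q}\right]\left[X_q(n) - X_q(n-1)\right] \\
&+ A_p\left[B_q^{\sim 1}B_0\right]^{-1}B_q^{\sim 1}\int_{n-1}^{n}Y(s)\,ds \tag{4.6}
\end{align*}

and

\[ X_q(n) = e^{\mathcal{B}}X_q(n-1) + \int_{n-1}^{n}e^{\mathcal{B}(n-u)}E_qY(u)\,du, \qquad n \ge 1. \tag{4.7} \]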


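Proof. Substituting Eq. (4.4) into Eq. (4.5) leads to

\begin{align*}
\Delta L_n ={}& \sum_{\nu=0}^{p-q-1}\left[\underline{E}_q\mathcal{B}^{p-q-1-\nu}E_q + \sum_{k=\nu}^{p-q-2}A_{p-q-k-1}\underline{E}_q\mathcal{B}^{k-\nu}E_q\right]\left[D^{\nu}Y(n) - D^{\nu}Y(n-1)\right] \\
&+ \underline{E}_q\mathcal{B}^{p-q}\left[X_q(n) - X_q(n-1)\right] + \left[\underline{A}_q + \sum_{k=1}^{p-q}A_{p-q-k+1}\underline{E}_q\mathcal{B}^{k}\right]\int_{n-1}^{n}X_q(s)\,ds \\
&+ \sum_{k=1}^{p-q}A_{p-q-k+1}\underline{E}_q\mathcal{B}^{k-1}E_q\int_{n-1}^{n}Y(s)\,ds,
\end{align*}

where $\underline{E}_q$ denotes the matrix selecting the last m-block of a vector in $\mathbb{R}^{mq}$. Assumption A2 implies that $B_q^{\sim 1}B_0$ is invertible and, by Lemma 3.1, the matrix $\mathcal{B}$ is invertible as well. Thus, integration of Eq. (4.1) shows that

\[ \int_{n-1}^{n}X_q(s)\,ds = \mathcal{B}^{-1}\left[X_q(n) - X_q(n-1) - E_q\int_{n-1}^{n}Y(s)\,ds\right]. \]

Plugging this into the last expression for $\Delta L_n$ and using the equality $\underline{A}_q\mathcal{B}^{-1}E_q = -A_p\left[B_q^{\sim 1}B_0\right]^{-1}B_q^{\sim 1}$ proves Eq. (4.6). Equation (4.7) follows from setting $t = n$, $s = n - 1$ in Eq. (4.3). □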

In order to keep the notation simple we restrict our attention to unit increments $\Delta L_n$. In all our arguments and results, $\Delta L_n$ can be replaced by $\Delta^{\delta}L_n := L(n\delta) - L((n-1)\delta)$ for some $\delta > 0$.

5. Approximate recovery of the driving Levy process from discrete-time observations 

In this section we consider the question of how to obtain estimates of the increments $\Delta L_n$ of the driving Lévy process based on a discrete-time record of the multivariate CARMA process Y. The starting point is Eq. (4.6), which expresses the increment $\Delta L_n$ in terms of derivatives and integrals of Y. In order to approximate $\Delta L_n$ by a function $\widehat{\Delta L}_n^{(h)}$ of the discrete-time record, it is therefore necessary to approximate these derivatives and integrals. For this purpose we will employ forward differences (Eq. (5.1)) and the trapezoidal rule of numerical integration (Eq. (5.4)). We always assume that values of Y are available at the discrete times $(0, h, 2h, \ldots, T)$ only. For notational convenience we also assume that $h^{-1} \in \mathbb{N}$; our results continue to hold if this restriction is dropped.

Our main result in this section is Theorem 5.7. It states that the moments of the approximation error $\widehat{\Delta L}_n^{(h)} - \Delta L_n$ are of order $h^{1/2}$, and thus converge to zero as the sampling frequency $h^{-1}$ tends to infinity. Before we can prove this result we need to give a quantitative account of the approximation theory of derivatives and integrals of MCARMA processes; this is achieved in Sections 5.1 and 5.2, respectively.

5.1. Approximation of derivatives. Throughout we will approximate derivatives by so-called forward differences, which can be interpreted as iterated difference quotients. For a general introduction to finite difference approximations, see [25, Chapter 1]. For any function f and any positive integer $\nu$ we define

\[ \Delta_h^{\nu}[f](t) := \frac{1}{h^{\nu}}\sum_{i=0}^{\nu}(-1)^{\nu-i}\binom{\nu}{i}f(t + ih). \tag{5.1} \]

It is apparent from this formula that knowledge of f on the discrete time grid $(0, h, \ldots, T)$ is sufficient to compute $\Delta_h^{\nu}[f](t)$ for any $t \in [0, T - \nu h] \cap h\mathbb{Z}$. We will consider the differentiation of integrals of functions, for which we introduce the notations

\[ I_f(t) := \int_0^t f(s)\,ds, \quad\text{as well as}\quad \varepsilon^{(h)}_{f,n} := \Delta_h^{1}[I_f](n) - f(n) \tag{5.2} \]

for the corresponding approximation error. In the next lemma we analyse this approximation for the case when f is a Lévy process.

Lemma 5.1. The sequence of approximation errors e] is i.i.d. Moreover, for every w e Q and for every 
integer n the approximation error e^^^ converges to zero as h ^ 0. If, for some positive integer k, the 

absolute moment E ||L(l)||'-*'° is finite, then E = (9(/i*^**-'°), as h ^ 0, where the constant implicit in 

the 0{-) notation does not depend on n. 

Proof. We first observe that

\[ \left\|I_L(n+h) - I_L(n) - hL(n)\right\| = \left\|\int_n^{n+h}[L(s) - L(n)]\,ds\right\| \le \int_n^{n+h}\|L(s) - L(n)\|\,ds. \]

The right continuity of $t \mapsto L(t)$ implies that for every integer n and each $\epsilon > 0$ there exists a $\delta_{\epsilon,n}$ such that $\|L(n+t) - L(n)\| \le \epsilon$ for all $0 \le t \le \delta_{\epsilon,n}$. This means that $\|I_L(n+h) - I_L(n) - hL(n)\| \le h\epsilon$, provided $h \le \delta_{\epsilon,n}$. Dividing by h thus proves $\varepsilon^{(h)}_{L,n} \to 0$. The proof also shows that $\varepsilon^{(h)}_{L,n}$ is a deterministic function of the increments $(L(s) - L(n),\ n \le s \le n + h)$. Since the increments of a Lévy process are stationary and independent, this implies that $(\varepsilon^{(h)}_{L,n})_n$ is an i.i.d. sequence.

For the second claim about the size of the absolute moments of $\varepsilon^{(h)}_{L,n}$ for small h it is no restriction to assume that $n = 0$. Successive application of the triangle inequality and Hölder's inequality with the dual exponent $k'$ determined by $1/k + 1/k' = 1$ shows that



II II* 1 r 

II '^-"ll M Jo 
Using k/k' - k - I it follows that 



4E(]|"||L(.)||d.[<J,I 



||L(i)||^d5j U lds\ 



Since ||L(.s)||'^ is positive we can interchange the expectation and integral. By Proposition 2.3, E ULCs)!!*^ is 
of order which impHes that \\ef^o\\ = OC/i*^/*"). □ 

Lemma 5.1 was dedicated to the analysis of the error of approximating the first derivative of the integral of a Lévy process. We will also need analogous results for higher order derivatives of iterated integrals of Lévy processes. The proofs are similar in spirit and only technically more complicated. For a positive integer $\nu$ we generalize the notations (5.2) to

\[ I_f^{\nu}(t) := \int_0^t I_f^{\nu-1}(s)\,ds, \quad I_f^{1}(t) := \int_0^t f(s)\,ds, \quad\text{and}\quad \varepsilon^{\nu,(h)}_{f,n} := \Delta_h^{\nu}[I_f^{\nu}](n) - f(n). \tag{5.3} \]

Clearly, if the function f has only countably many jump discontinuities then $D^{\nu}I_f^{\nu}(t) = f(t)$ almost everywhere.

Lemma 5.2. For every positive integer v > 1 and every integer n, the error e^ji'^'l converges to zero as 

h 0. If, moreover, E ||L(1)||*° is finite for some k > 0, then E \ef'^ = 0(h'''^''><''>) as h 0. 

Proof. Deferred to the appendix. □ 
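The forward differences of Eq. (5.1) only require values on the sampling grid, which the following sketch makes explicit. It assumes NumPy; the function name `forward_difference`, the grid and the test functions are illustrative.

```python
import numpy as np
from math import comb

def forward_difference(f_grid, h, nu, idx):
    """Delta_h^nu[f](t) at t = idx * h, where f_grid[i] = f(i * h)."""
    if idx + nu >= len(f_grid):
        raise IndexError("samples up to t + nu * h are required")
    terms = [(-1) ** (nu - i) * comb(nu, i) * f_grid[idx + i] for i in range(nu + 1)]
    return sum(terms) / h**nu

h = 0.1
t = h * np.arange(21)                                   # grid 0, h, ..., 2

# Exact for a polynomial of degree nu: second difference of t^2 equals 2.
second_diff = forward_difference(t**2, h, 2, 5)

# First-order accurate in general: D(t^3) at t = 1 is 3, the forward difference is 3.31.
first_diff = forward_difference(t**3, h, 1, 10)
```

The second example shows an error of order h, in line with the generic first-order accuracy of forward differences for smooth functions.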

With these auxiliary results finished, we turn to approximating derivatives of the multivariate CARMA 
process Y. This is the first big step towards discretizing Eq. (4.6). 

Proposition 5.3. Let Y be an L-driven multivariate CARMA process satisfying Assumption Al, let n 
be an integer and denote by e^y^J — AJJ[y](n) - \yY(n) the error of approximating the vth derivative ofY 
by the forward differences defined in Eq. (5.1). Assume that, for some A; > 0, E ||L(l)||'-*^^° < oo. It then holds 
that: 

i) // 1 < V < p - ^ - 2, then E Wef^^f = 0(h''). Ifv = p-q-\, then E We^y'^jf = 0(h''"-''^«). 

ii) The sequence e'y''^ is strictly stationary and strongly mixing with exponentially decaying mixing coef- 
ficients. 

Proof. We first prove the assertions i) about the behaviour of the absolute moments of e^y^^ for small 
values of /i. If 1 < v < /:> - ^ - 2 it follows from Lemma A. 1 that the paths of Y are at least v + 1 times 
difFerentiable; therefore. Lemma A. 3 implies that < h ^\yp„^sKn+vh ||D''^'F(5)||. To prove the claim it 

is thus sufficient to show that E sup,,^j^,,_|_y;, ||D^^' 7(5)11*^ < oo. By the defining observation equation (3. 10b), 
F is a linear combination of the first q+ \ m-blocks of the state process X; the state equation (3. 10a) implies 
DZ' = Z'^' , / = 1 , . . . , p - 1, and since v is assumed to be no bigger than p - ^ - 2 it follows that D''^' F is a 
linear combination of the first p- \ m-blocks of X, say D''+'y - LambdaX, for some matrix A e Mij^p,„(R.). 
We can then apply Lemma A.4 to estimate 



E sup ||D''+'y(.s)||* < ||A||*E sup ||Z(.s)||* < oo, 



which proves the first claim. Ify = p-<7-lwe start again from the observation that F is a linear 
combination of the first ^ + 1 m-blocks of Z, namely, 

Y(t) = B^X,(t) + B^X^^^'Ht), teR, = [ Bo • ■ • Vi ] ■ 
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By solving the last p - q + I block-rows of the state equation (3.10a) one can express Z*''^'' as 

(p-q-iy. 

where the notation /J for the v-fold iterated integral of a function / has been introduced in Eq. (5.3). By 
linearity and the fact that AJJ[p] - D''p = for polynomials p of degree v (Lemma A.3,ii)), it follows that 

^B^ [Af "-'[Z.Kw) - D^-«-'Z,(«)] 
- B,A [Af [Pp\ in) - D^-^-'/f ^(«)] 
+ B, fA^-*-> f/f in) - D'-^'-'/f -'(«)] . 

Both Xq (by Lemma A. 1) and l'^ are p - q times differentiable so we can apply Lemma A.3,iii) to bound 
the differences in the first two lines of the last display by h times the supremum of the (;?-^)th derivative of 
Xg and respectively. The contribution from the last line is the approximation error for the {p-q-l)th 
derivative of the (p - q - l)-fold iterated integral of the Levy process L which has been investigated in 
Lemma 5.2. We thus obtain that 



W Y,n ^ " 



\B \ sup D"-%(f) + Bj A sup ||Z(OII 



i\\ 11—11 



p-q-\,{h) 



As before, one shows that the first term has finite A:th moments which is of order (9(/i*). The second term 
has been shown in Lemma 5.2 to have finite Mi moment of order (9(/!*^**'°) which dominates the first term 
for h <\; this completes the proof of i). 

In order to prove that the sequence e^^''* is strongly mixing, it is enough, by virtue of Lemma A.l,iv) 
and Lemma A. 2, to show that the approximation error e'y'^^ is measurable with respect to '3''"^^'\ the cr- 
algebra generated by {Y{t) . n ^ t ^ vh]. Clearly, A^[y](f) is measurable with respect to the cr-algebra 
generated by {Y,, 7,+;,, . . . , Fj+y/,). By the definition of derivatives as the limit of different quotients and 
the assumed differentiability of f F(f), the derivative D]Y, is the w-wise limit, as s goes to zero, of the 
functions w AJ![y(w)](f). Each of these functions is measurable with respect to a-{Y,, Y,+s, ■ ■ ■ , Y,+ys), 
and therefore in particular with respect to the larger cr-algebra Since pointwise limits of measurable 

functions are measurable ([23, Theorem 1.92]), the claim follows. 

The claim that the sequence e^^''* is strictly stationary is a consequence of the fact that the multivariate 
CARMA process Y is strictly stationary (Lemma A.l,i)). By the definition of stationarity it is enough 
to show that for every natural number K, all indices n\, . . . ,nK e Z and every integer k, the two arrays 
(e^v'^K Cv*'"' ) and (Cv^''^ e^v^^ , i ) have the same distribution. We first observe that for each n e Z 

and each w € Q, Cy^t^ = lim^^ot where ey**''' := A]J[y](n) - Aj[y](n). In particular, since w-wise 

convergence implies convergence in distribution, it holds that 

V«^y „i > ■ ■ • '*^y,„x > ^VCy_„, > ■ ■ • '«^y,„jfA 

(V,(h,s) v,(h,s) s v,(/i) v,(j) s 

as s tends to zero. For every finite s, the strict stationarity of Y implies that (Cy*''''', . . . ,ey*''''') is equal in 



jn follows from the fact 

limits are uniquely determined ([23, Remark 13.13]). 



distribution to {e^Yn^lk' ■ ■ ■ '^yn's^+i-*' assertion then follows from the fact that in Polish spaces weak 



5.2. Approximation of integrals. This section is devoted to the approximations of the integrals appearing in Eq. (4.6), namely $\int_{n-1}^{n}Y(s)\,ds$ and $\int_{n-1}^{n}e^{\mathcal{B}(n-s)}E_qY(s)\,ds$. One of the simplest approximations for definite integrals is the trapezoidal rule, see, e.g., [13, Chapter 9] for an introduction to the topic of numerical integration. For any function $f : \mathbb{R} \to M$ with values in a metric space M it is defined as

\[ T^{K}_{[a,b]}f := \frac{b-a}{K}\left[\frac{f(a) + f(b)}{2} + \sum_{k=1}^{K-1}f\left(a + k\,\frac{b-a}{K}\right)\right], \qquad K \in \mathbb{N}. \tag{5.4} \]

The quantity $T^{K}_{[a,b]}f$ approximates the definite integral $\int_a^b f(s)\,ds$. We will usually set $[a,b] = [n-1,n]$, $n \in \mathbb{N}$, and $K = h^{-1}$. It is clear that $T^{h^{-1}}_{[n-1,n]}f$ can be computed from knowledge of the values of f on the discrete time grid $(0, h, 2h, \ldots)$. We shall now derive properties of the approximation error of convolutions of vector-valued functions with matrix-valued kernels. For any compatible functions $f : [0,\infty) \to \mathbb{R}^d$ and $g : [0,1] \to M_d(\mathbb{R})$ we use the notation

\[ \varepsilon^{(h)}_{g \circ f,n} := T^{h^{-1}}_{[n-1,n]}\,g(n - \cdot)f(\cdot) - \int_{n-1}^{n}g(n-s)f(s)\,ds \tag{5.5} \]

for the difference between the exact value of the convolution integral and the one obtained from the trapezoidal approximation with sampling interval h. In the next proposition we analyse this approximation error if f is a multivariate CARMA process; this is the second big step towards discretizing Eq. (4.6).
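The trapezoidal rule of Eq. (5.4) can be sketched in a few lines. The example assumes NumPy; the function name `trapezoid` and the smooth test integrand are illustrative choices, and the quadratic error decay shown here is the classical behaviour for twice continuously differentiable integrands.

```python
import numpy as np

def trapezoid(f, a, b, K):
    """T^K_{[a,b]} f as in Eq. (5.4): composite trapezoidal rule with K subintervals."""
    nodes = a + (b - a) * np.arange(1, K) / K            # interior grid points
    interior = sum(f(s) for s in nodes)
    return (b - a) / K * ((f(a) + f(b)) / 2 + interior)

exact = np.e - 1.0                                       # integral of exp over [0, 1]
err_coarse = abs(trapezoid(np.exp, 0.0, 1.0, 10) - exact)   # h = 0.1
err_fine = abs(trapezoid(np.exp, 0.0, 1.0, 20) - exact)     # h = 0.05
ratio = err_coarse / err_fine
```

Halving the sampling interval reduces the error by a factor of about four, that is, the error is of order $h^2$ for this integrand.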

Proposition 5.4. Assume that L is a Lévy process. Let Y be a d-dimensional L-driven MCARMA process satisfying Assumption A1, let $F : [0,1] \to M_d(\mathbb{R})$ be a twice continuously differentiable function and denote by $\varepsilon^{(h)}_{F \circ Y,n}$ the approximation error of the trapezoidal rule, defined in Eq. (5.5). If $E\|L(1)\|^k$ is finite then $E\|\varepsilon^{(h)}_{F \circ Y,n}\|^k = O(h^{2k})$, as $h \to 0$. Moreover, the sequence $\varepsilon^{(h)}_{F \circ Y}$ is strictly stationary and strongly mixing.



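Proof. By the definition of $\varepsilon^{(h)}_{F \circ Y,n}$ (Eqs. (5.4) and (5.5)) we can write

\[ \varepsilon^{(h)}_{F \circ Y,n} = h\sum_{i=0}^{h^{-1}}a_i^{(h)}Y(n - 1 + ih) - \int_{n-1}^{n}F(n-s)Y(s)\,ds, \]

where

\[ a_0^{(h)} = \frac{F(1)}{2}, \qquad a_{h^{-1}}^{(h)} = \frac{F(0)}{2}, \qquad a_i^{(h)} = F(1 - ih), \quad i = 1, \ldots, h^{-1} - 1. \]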



Using Dirac's ^-distribution, which is defined by the property that J f(x)6 xf,(x)dx - fix^)) for all compactly 
supported smooth functions /, as well as the moving average representation (3.13) of Y we obtain that 

4oy = r Z af6„-i^ih(s) - F(n - s) Y(s)ds 

Jn-l j 

= r /i^af'5„_i+,v,(i)-F(«-.9) r Be^^'-"'>EpdL(u)ds. 



Theorem 2.4 allows us to interchange the order of integration so that we obtain 

Be^'^'-"^EpdsdL{u) 

,n-i) L ~r J 



hJ]a'^^d„-iM-F{n-s) 

OO J max{ii,n-l} j 

Xn-l /-*n 
-hY^af^6n-iv,h{s)-F{n- s) 
OO Jn-\ I 

hJ]af^6„-uih{s)-F{n-s 



Be^(^-"'>EpdsdL(u) 



Be^(-'-"^EpdsdL(u). 



With the notations 



Jo V 



Be^'ds and 



-CO . 



[0,1] 



Mrf,„(R), 



[t ^ /o [h af^6,-i^ihis) - Fit - s)] Be^'ds Ep, 



(5.6) 
(5.7) 



(5.8) 



we can rewrite the previous display as 

r" 

4oy„ = r*'''Z(« - 1) + G^''\n - u)dL(u), 

where we have used the moving average representation (3.12) of the state vector process X. This equation and the strict stationarity of X asserted in Lemma A.1,i) immediately imply that the sequence $\varepsilon^{(h)}_{F \circ Y}$ is strictly stationary and strongly mixing. By Proposition A.6 there exists a constant C such that $\|\Gamma^{(h)}\| \le Ch^2$ and $\|G^{(h)}(t)\| \le Ch^2$ for all $t \in [0,1]$, which implies that



hplrt ^ /!^*C*2*E||Z(n - 1)||* + 2*1 



G<''*(n - u)dL(u] 

Jn-\ 



The A:th moment of X{n - 1) is finite by Lemma A.l,iii), so it suffices to prove that the second term is of 
order 0{h^''). To this end we use the fact that £\ G'^'^Hn - u)AL{u) is an infinitely divisible random variable 
whose characteristic triplet (7^?', v^'') can be expressed explicitly in terms of the characteristic triplet 
(7^, Si, v^) of the Levy process L. Using the explicit transformation rules (2.2) one sees that the condition 



/ 
j 



WxW'-yfm^Oih^'-), r = 2,3. 

WI<1 



\\x\\'- v^!:\dx) ^Oih^'-), r = 2, 



so that we can apply Lemma 2.2 to conclude that E 11 1 ' Gin - M)dL(M)|| - 0(h^'') 



It remains to estimate $X_q(n)$. In view of the AR(1) structure given in Eq. (4.7) we compute estimates

\[ \hat X_q^{(h)}(n) = e^{\mathcal{B}}\hat X_q^{(h)}(n-1) + \hat I_n^{(h)}, \qquad \hat X_q^{(h)}(0) = \hat X_{q,0}, \quad n \ge 1, \tag{5.9} \]

where $\hat I_n^{(h)} = T^{h^{-1}}_{[n-1,n]}\,e^{\mathcal{B}(n-\cdot)}E_qY(\cdot)$ is the trapezoidal rule approximation to $\int_{n-1}^{n}e^{\mathcal{B}(n-s)}E_qY(s)\,ds$ and $\hat X_{q,0}$ is a deterministic or random initial value. We introduce the notation

\[ \varepsilon^{(h)}_{X,n} = \hat X_q^{(h)}(n) - X_q(n). \tag{5.10} \]

It is easy to see that the sequence $\varepsilon^{(h)}_X$ satisfies $\varepsilon^{(h)}_{X,n} = e^{\mathcal{B}}\varepsilon^{(h)}_{X,n-1} + \varepsilon^{(h)}_{F \circ Y,n}$, $n \in \mathbb{N}$, where $F : t \mapsto e^{\mathcal{B}t}E_q$ and $\varepsilon^{(h)}_{F \circ Y,n}$ is of the form analysed in Proposition 5.4. For the following result we recall the notion of an absolutely continuous measure. By Lebesgue's decomposition theorem ([23, Theorem 7.33]), every measure $\mu$ on $\mathbb{R}^m$ can be uniquely decomposed as $\mu = \mu_c + \mu_s$, where $\mu_c$ and $\mu_s$ are absolutely continuous and singular, respectively, with respect to m-dimensional Lebesgue measure. If $\mu_c$ is not the zero measure we say that $\mu$ has a non-trivial absolutely continuous component.
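The recursion (5.9) can be sketched for a scalar truncated state, that is, with the matrices of Eq. (4.2) replaced by numbers b < 0 and e. The example assumes NumPy; the function name `estimate_state`, the parameter values and the deterministic input Y ≡ 1 (chosen so that the exact state is known in closed form) are illustrative assumptions.

```python
import numpy as np

def estimate_state(y_grid, b, e, h, x0=0.0):
    """Scalar version of recursion (5.9); y_grid[i] = Y(i * h), returns estimates at n = 0, ..., N."""
    K = round(1 / h)                                    # h^{-1} is assumed to be an integer
    N = (len(y_grid) - 1) // K
    x_hat = [x0]
    for n in range(1, N + 1):
        s = (n - 1) + h * np.arange(K + 1)              # sampling times in [n - 1, n]
        vals = np.exp(b * (n - s)) * e * y_grid[(n - 1) * K : n * K + 1]
        integral = h * (vals[0] / 2 + vals[1:-1].sum() + vals[-1] / 2)   # trapezoidal rule (5.4)
        x_hat.append(np.exp(b) * x_hat[-1] + integral)
    return np.array(x_hat)

b, e, h, N = -1.5, 2.0, 0.01, 5
y_grid = np.ones(N * round(1 / h) + 1)                  # deterministic input Y = 1
x_hat = estimate_state(y_grid, b, e, h)
x_exact = e * (np.exp(b * np.arange(N + 1)) - 1.0) / b  # solution of x' = b x + e, x(0) = 0
max_error = np.max(np.abs(x_hat - x_exact))
```

Because b is negative, the errors committed in each step are damped geometrically, so `max_error` stays of the order of the single-step trapezoidal error.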

Proposition 5.5. Assume that L is a Lévy process. Let Y be a d-dimensional L-driven MCARMA process satisfying Assumptions A1 and A2. The sequence $\varepsilon^{(h)}_X$ defined by Eqs. (5.9) and (5.10) converges almost surely to a stationary and ergodic sequence which is independent of $\hat X_{q,0}$. If, for some integer k, $E\|L(1)\|^k$ is finite, then the absolute moment $E\|\varepsilon^{(h)}_{X,n}\|^k$ is of order $O(h^{2k})$ as $h \to 0$. If, moreover, the distribution of the random variable

\[ \int_0^1\mathbf{G}^{(h)}(1-s)\,dL(s), \qquad \mathbf{G}^{(h)}(s) = \begin{bmatrix} G^{(h)}(s)^T & \left(e^{As}E_p\right)^T \end{bmatrix}^T, \quad G^{(h)}(s) \text{ defined in Eq. (5.7)}, \tag{5.11} \]

has a non-trivial absolutely continuous component, then the process is exponentially strongly mixing.
Proof. We first observe that 

n-2 

gW ^ ^(n-mjh) y vB (ft) . J 



and define the sequence e^'^ by 



v=0 



vB (ft) ^ 
t oy,;j-v' 
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By this definition, is obviously independent of Z^'q. Since s^*j, is strongly mixing by Proposition 5.4, 
it is in particular ergodic ([23, Exercise 20.5.1]). The sequence is the unique stationary solution of the 
AR(1) equations 



n e : 



and an application of [24, Theorem 4.3] to the infinite-order moving average representation of e^' shows 
that this last sequence is ergodic as well. It remains to prove that e^'*^ converges to e^''^ almost surely as 
n ^ oo. This follows from 



■ g*'"^ II 



< e' 



(n-l)B 



FoY,n- 



the fact that by Lemma 4. 1 the eigenvalues of the matrix B have strictly negative real parts, and the almost 
sure convergence of the last sum ([10, Proposition 3.1.1]). For the proof that the ^h moments of e^''^ are of 
order (9(/i^*) we use the following generalization of Holder's inequality, which can be proved by induction: 
for any k random variables Z\,. . . ,Zj^ and positive numbers pi, . . . ,pi. such that Yi^lPi- 1 it holds that 



E(Zi ■...■Zt)<f](EZf) 



i/p, 



(5.12) 



;=I 



Choosing pj - l/k, i - I, . . . ,k, and using that, by Proposition 5.4 there exists a constant C, independent 
of n, such that E Hsp^ ||*^ < Ch^'' it follows that 



lie*''' II 



Z ■ Zi 



»viB|| 



\\\ FoYji- 



'^FoY,n-Vk 



2k 



Zi 

Vv=() 



which is of order 0(h^^) because the sum is finite due to the eigenvalues of B having strictly negative 

Hi) 

real parts. In order to show that the sequence is strongly mixing, we note that the stacked process 
( e^'^ )^ satisfies the AR(1) equation 



*'x,n 

X(n) 







r 



Ah) 

^X,n-\ 

X(n - 1) 



+ Z„ 



-s) 



dL(s) 



where (Z„)„gz is an i.i.d. noise sequence. An extension of the arguments leading to [29, Theorem 1], 
which is detailed in the proof of [38, Theorem 4.3], shows that ARMA, and in particular, AR(1) processes 
are strongly mixing with exponentially decaying mixing coefficients if the driving noise sequence has a 
non-trivial absolutely continuous component, which is precisely what is assumed in the proposition. □ 

Remark 5.6. Sufficient conditions for the assumption made in the previous proposition to hold can be obtained from the observation that the random variable defined in Eq. (5.11) is infinitely divisible and that its characteristic triplet can be obtained as in Eqs. (2.2). Sufficient conditions for an infinitely divisible random variable to be absolutely continuous, in terms of its characteristic triplet, can be found in [43] and [36, Section 27]. Since mixing is not our primary concern in this paper, and our results hold without it, we do not pursue this issue further here.

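5.3. Approximation of the increments $\Delta L_n$. If we combine what we have so far it follows that we can obtain estimates $\widehat{\Delta L}_n^{(h)}$ of the increments of the Lévy process L by discretizing Eq. (4.6), that is

\begin{align*}
\widehat{\Delta L}_n^{(h)} ={}& \sum_{\nu=0}^{p-q-1}\left[\underline{E}_q\mathcal{B}^{p-q-1-\nu}E_q + \sum_{k=\nu}^{p-q-2}A_{p-q-k-1}\underline{E}_q\mathcal{B}^{k-\nu}E_q\right]\left[\Delta_h^{\nu}[Y](n) - \Delta_h^{\nu}[Y](n-1)\right] \\
&+ \left[\underline{A}_q\mathcal{B}^{-1} + \sum_{k=1}^{p-q}A_{p-q-k+1}\underline{E}_q\mathcal{B}^{k-1} + \underline{E}_q\mathcal{B}^{p-q}\right]\left[\hat X_q^{(h)}(n) - \hat X_q^{(h)}(n-1)\right] \\
&+ A_p\left[B_q^{\sim 1}B_0\right]^{-1}B_q^{\sim 1}\,T^{h^{-1}}_{[n-1,n]}Y, \tag{5.13}
\end{align*}

where $\underline{E}_q$ denotes the matrix selecting the last m-block of a vector in $\mathbb{R}^{mq}$, the forward differences $\Delta_h^{\nu}[Y](n)$ are defined in Eq. (5.1), the estimates $\hat X_q^{(h)}$ are computed recursively by Eq. (5.9) and the formula for the trapezoidal approximation $T^{h^{-1}}_{[n-1,n]}Y$ is given in Eq. (5.4). Writing

\[ \widehat{\Delta L}_n^{(h)} = \Delta L_n + \varepsilon_n^{(h)}, \tag{5.14} \]

the approximation error $\varepsilon_n^{(h)}$ is given by

\begin{align*}
\varepsilon_n^{(h)} ={}& \sum_{\nu=0}^{p-q-1}\left[\underline{E}_q\mathcal{B}^{p-q-1-\nu}E_q + \sum_{k=\nu}^{p-q-2}A_{p-q-k-1}\underline{E}_q\mathcal{B}^{k-\nu}E_q\right]\left[\varepsilon^{\nu,(h)}_{Y,n} - \varepsilon^{\nu,(h)}_{Y,n-1}\right] \\
&+ \left[\underline{A}_q\mathcal{B}^{-1} + \sum_{k=1}^{p-q}A_{p-q-k+1}\underline{E}_q\mathcal{B}^{k-1} + \underline{E}_q\mathcal{B}^{p-q}\right]\left[\varepsilon^{(h)}_{X,n} - \varepsilon^{(h)}_{X,n-1}\right] \\
&+ A_p\left[B_q^{\sim 1}B_0\right]^{-1}B_q^{\sim 1}\left[T^{h^{-1}}_{[n-1,n]}Y - \int_{n-1}^{n}Y(s)\,ds\right],
\end{align*}

where $\varepsilon^{\nu,(h)}_{Y,n} = \Delta_h^{\nu}[Y](n) - D^{\nu}Y(n)$ and $\varepsilon^{(h)}_{X,n}$ is defined in Eq. (5.10).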



The following theorem summarizes the results of the previous two subsections about the probabilistic properties of the sequence of approximation errors $\varepsilon^{(h)}$.

Theorem 5.7. Assume that L is a Lévy process and Y is an L-driven multivariate CARMA process given by the state space representation (3.10) and satisfying Assumptions A1 and A2. Denote by $\Delta L_n = L(n) - L(n-1)$ the unit increments of L and by $\widehat{\Delta L}_n^{(h)}$ the estimates of the unit increments of L obtained from Eq. (5.14). The stochastic process $\varepsilon^{(h)} = \widehat{\Delta L}^{(h)} - \Delta L$ has the following properties:

i) There exists a stationary, ergodic stochastic process $\tilde\varepsilon^{(h)}$ such that $\|\varepsilon_n^{(h)} - \tilde\varepsilon_n^{(h)}\| \to 0$ almost surely as $n \to \infty$. If the random variable defined in Eq. (5.11) has a non-trivial absolutely continuous component with respect to the Lebesgue measure, then $\tilde\varepsilon^{(h)}$ is exponentially strongly mixing.

ii) IfE ||L(1)||^*'° < 00, for some positive integer k, then there exists a constant C > such that 



supE||e^''^||'' < Ch 

neN 



1/2 



,k. 



(5.15) 



Proof. Both claims follow directly from Propositions 5.3 to 5.5. 



For the purpose of estimating a parametric model of the Lévy process L based on the noisy observations $\widehat{\Delta L}^{(h)}$ it is important not only to have a sound quantitative understanding of the extent to which the true increments $\Delta L$ differ from the estimated increments $\widehat{\Delta L}^{(h)}$, but also to know how strongly this difference is affected when a function is applied to the increments. This issue is investigated in the next lemma.

Lemma 5.8. Let f : W" — > R'' be a function with bounded kth derivative and let I be some fixed positive 
integer Assume that E ||L(1)||**''° < 00, and further that, for any integer 1 < r < - 1 and any integers 
1 ^ i\, . . . ,ir ^m, the moments of the partial derivatives of f satisfy 

\\kl 



E||5„---5,„/(L(l))|| 



It then holds that 



supE 

nsN 



/(ALf)-/(AL„) 



(5.16) 
(5.17) 



Proof. By Taylor's theorem ([1, Theorem 12.14]) we have that 

k-l , 

/ (alI" J - / (AL„) = / (AL„ + sW) - / (AL„) = Yj h ^^'^"^ (^5'*)' + i^^"'' ^""') ' 

where 

m m 

d^^^f (AL„) {efj = 5, ■■ ■ d,f (AL„) s^''^ ■ ■ ■ 4"'''' 

'1=1 ',=1 

defines the action of the rth derivative of /. We note that 

m m 

(AL„) (eW)ll \h ■ ■ ■ (AL„)|| ||4ir ¥''f (AL„)|| ||4f f ' 

'1=1 v=i 
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m m 



and assumption (5.16) implies that 

'1=1 ',=1 

It follows from the boundedness of the ^h derivative of / that the remainder R (aL„; e'^^^ satisfies 



<c\W. 



coll 



for some constant C. In particular. 



<2'] 



i II^W/ (AL„) + 2'E \\r (aL„; 4"^)||' 



Vr=l 

k-1 k-\ 



'IB llgWII 



n = l ,-,=1 



By Theorem 5.7, the assumption that L(l) has a finite {kl)oth absolute moment implies that E||e^''^|| is 
of order Oih"^'^") as /; — > for all 1 ^ k ^ k, where the constant implicit in the (9( ) notation does not 
depend on n. It thus follows by an application of the generalized Holder inequality (5.12) with exponents 
pi = . . . = p/ = kl, pi+i = klik - 1) that 



/(ALf)-/(AL„) < n;=iE(||^(^')/(AL„)||*')' 



/■l,...,/7=l L /=i 



{r\+...+n)k \ k 

,.-1 ' 



+0 



/ n + -+'-/ \ 

=0 A f'l +■■■+'■' '"<'-"lo 



Since for any a e [0, 2] and any positive integer r it holds that {ra)Q < rao, the dominating term in this 
sum is the one corresponding X.o r\ - . . . - ri - I, which is of order 0(h^^^). Thus Eq. (5.17) is shown. □ 



6. Generalized method of moments estimation with noisy data 

In this section we consider the problem of estimating a parametric model $P_\vartheta$ if only a disturbed i.i.d. sample of the true distribution is available. More precisely, assume that $\Theta$ is some parameter space, that $(P_\vartheta : \vartheta \in \Theta)$ is a family of probability distributions on $\mathbb{R}^m$ and that

\[ X^N = (X_1, \ldots, X_N), \qquad \mathbb{R}^m \ni X_n \sim P_{\vartheta_0}, \tag{6.1} \]

is an i.i.d. sample from $P_{\vartheta_0}$. The classical generalized method of moments (abbreviated as GMM) is a well-established procedure for estimating the value of $\vartheta_0$ from the observations $X^N$, see for instance [17, 18, 30] for a general introduction. After introducing some relevant notation and taking a closer look at two particularly important special cases of this class of estimators we state the result about the consistency and asymptotic normality of GMM estimators for easy reference in Theorem 6.1. Our goal in this section is to extend this result to the situation where the sample $X^N$ from the distribution $P_{\vartheta_0}$ cannot be observed directly. Instead, we assume that for each $h > 0$ there is a stochastic process $\varepsilon^{(h)}$, not necessarily independent of $X^N$, which we think of as a disturbance to the i.i.d. sample $X^N$, and the value of $\vartheta_0$ is to be estimated from the observation $(X_1 + \varepsilon_1^{(h)}, \ldots, X_N + \varepsilon_N^{(h)})$. In Theorem 6.2 we prove under a mild moment assumption that the asymptotic properties of the GMM estimator, as N becomes large and h becomes small, are not altered by the inclusion of the noise process $\varepsilon^{(h)}$. Finally, we use this result in Theorem 6.5 to answer the question of how to estimate a parametric model for the driving Lévy process of a multivariate CARMA process from discrete-time observations.

Underlying the construction of any GMM estimator is the existence of a function $g : \mathbb{R}^m \times \Theta \to \mathbb{R}^q$ such that for $X_1 \sim P_{\vartheta_0}$,

\[ Eg(X_1, \vartheta) = 0 \iff \vartheta = \vartheta_0. \tag{6.2} \]
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The analogy principle, that is the philosophy that unknown population averages should be approximated 
by sample averages, then suggests that an estimator of i?o based on the sample X^, given by Eq. (6.1), 
can be defined as 

1 ^ 



j?'^ = argmin,,g0 



(6.3) 

Wn 

where Wn is a positive definite, possibly data-dependent, qxq matrix defining the norm 



N- , 



|M|^„ : ^ R% \\x\\^^ = {x^Wnx)''\ xeR". 

As we will see shortly, the choice of $W_N$ influences the asymptotic variance $\Sigma$ of the estimator given in
Eq. (6.5). The optimal choice of weighting matrices $W_N$ is described in Corollary 6.4.
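
The definition (6.3) can be illustrated by a minimal Python sketch. Everything in it is an assumption made for illustration only: the exponential model with mean $\vartheta$, the moment function $g(x, \vartheta) = (x - \vartheta,\ x^2 - 2\vartheta^2)$ (so $q = 2$, $r = 1$), the grid search over a compact parameter interval, and all function names. The sketch minimizes the squared $W$-norm, which has the same minimizer as the norm itself.

```python
def empirical_moments(sample):
    """Return the sample averages of x and x^2 (computed once)."""
    n = len(sample)
    return sum(sample) / n, sum(x * x for x in sample) / n

def averaged_moment_function(empirical, theta):
    """Sample average of the hypothetical g(x, theta) = (x - theta, x^2 - 2 theta^2)."""
    mean_x, mean_x2 = empirical
    return (mean_x - theta, mean_x2 - 2.0 * theta ** 2)

def criterion(empirical, theta, weight):
    """Squared W-norm m^T W m of the averaged moment function."""
    m = averaged_moment_function(empirical, theta)
    return sum(m[i] * weight[i][j] * m[j] for i in range(2) for j in range(2))

def gmm_estimate(sample, weight, lower=0.1, upper=10.0, steps=2000):
    """Minimize the criterion over a grid on the compact interval [lower, upper]."""
    empirical = empirical_moments(sample)
    grid = [lower + (upper - lower) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda theta: criterion(empirical, theta, weight))
```

A grid search is used here only because it needs no smoothness assumptions; any numerical optimizer over the compact parameter set could take its place.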

The advantage in considering a GMM approach to the estimation problem is that it contains many
classical estimation procedures as special cases. Here, we only mention two such special cases which are
particularly useful in the context of estimating a parametric model for a Lévy process. It is an immediate
consequence of Definition 2.1 that a Lévy process $L$ is uniquely determined by the distribution of the
unit increments $L(1) - L(0)$, which, in turn, is characterized by its characteristic function
$\mathbb{E} \exp\{i\langle u, L(1) - L(0)\rangle\} = \exp\{\psi(u)\}$ in its Lévy-Khintchine form (Eq. (2.1)). It is therefore natural to specify a parametric
model for $L$ by parametrizing the characteristic exponents, which amounts to defining, for each $\vartheta \in \Theta$, a
function $u \mapsto \psi_\vartheta(u)$ of the form (2.1). A promising estimator for $\vartheta_0$ in such a model is that value of $\vartheta$
that best matches the characteristic function $u \mapsto \exp\{\psi_\vartheta(u)\}$ with its empirical counterpart. This leads to
choosing the function $g$ in Eq. (6.3) as

$$g : \mathbb{R}^m \times \Theta \to \mathbb{R}^q : (x, \vartheta) \mapsto \begin{bmatrix} \operatorname{Re}\big( e^{i\langle u_n, x\rangle} - e^{\psi_\vartheta(u_n)} \big) \\ \operatorname{Im}\big( e^{i\langle u_n, x\rangle} - e^{\psi_\vartheta(u_n)} \big) \end{bmatrix}_{n = 1, \ldots, q/2},$$

where $u_1, \ldots, u_{q/2}$ are suitable elements of $\mathbb{R}^m$ at which the characteristic functions are to be matched. The
value of $q \in 2\mathbb{N}$ as well as the particular $u_n$ are chosen such that condition (6.2) holds, which means that the
model is identifiable. Another special case of the generalized method of moments estimator of considerable
practical importance arises if the parametric family of distributions is given as a family of probability
densities $p_\vartheta(\cdot)$. In this case, the choice

$$g : \mathbb{R}^m \times \Theta \to \mathbb{R}^r : (x, \vartheta) \mapsto \nabla_\vartheta \log p_\vartheta(x)$$

gives rise to the classical maximum-likelihood estimator with all its desirable asymptotic properties. 
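
As a hedged illustration of the characteristic-function moment function described above, the following Python sketch treats the univariate case ($m = 1$) for a Gamma law with scale $b$ and shape $a$, whose characteristic exponent is $\psi_{(b,a)}(u) = -a \log(1 - i b u)$. The choice of this parametric family, the matching points and all function names are assumptions made for the example only.

```python
import cmath

def gamma_char_exponent(u, b, a):
    """Characteristic exponent of a Gamma law with scale b and shape a."""
    return -a * cmath.log(1 - 1j * b * u)

def cf_moment_function(x, theta, points):
    """Real parts, then imaginary parts, of exp(i u x) - exp(psi_theta(u)) at each u."""
    b, a = theta
    diffs = [cmath.exp(1j * u * x) - cmath.exp(gamma_char_exponent(u, b, a))
             for u in points]
    return [d.real for d in diffs] + [d.imag for d in diffs]

def averaged_cf_moments(sample, theta, points):
    """Sample average of the moment function; close to zero at the true parameter."""
    n = len(sample)
    total = [0.0] * (2 * len(points))
    for x in sample:
        for j, value in enumerate(cf_moment_function(x, theta, points)):
            total[j] += value
    return [t / n for t in total]
```

With $q/2$ matching points the moment vector has dimension $q$, so two points already give an overidentified system for the two parameters $(b, a)$.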

In order to be able to state the classical result about the asymptotic properties of the generalized method
of moments estimator for a general moment function $g$ we introduce the notations

$$\Omega_0 = \mathbb{E}\, g(X_1, \vartheta_0)\, g(X_1, \vartheta_0)^T, \quad \text{and} \quad G_0 = -\mathbb{E}\, \nabla_\vartheta g(X_1, \vartheta_0)$$

for the covariance matrix of the moments and the generalized score matrix, respectively, where $\nabla$ denotes
the differential operator.

Theorem 6.1 ([30, Theorem 2.6 and Theorem 3.4]). Assume that $(P_\vartheta)_{\vartheta \in \Theta}$ is a parametric family of
probability distributions and let $X^N = (X_1, \ldots, X_N)$ be an i.i.d. sample from the distribution $P_{\vartheta_0}$ of length $N$.
Denote by $\hat\vartheta^N$ the GMM estimator based on $X^N$ defined in Eq. (6.3). Assume:

i) The domain $\Theta$ of $\vartheta$ is a compact subset of $\mathbb{R}^r$ and $\vartheta_0$ is in the interior of $\Theta$.

ii) For each $\vartheta \in \Theta$, the function $x \mapsto g(x, \vartheta)$ is measurable; for almost every $x \in \mathbb{R}^m$ the function
$\vartheta \mapsto g(x, \vartheta)$ is continuous on $\Theta$ and continuously differentiable in a neighbourhood $U$ of $\vartheta_0$. Moreover,
there exists a function $a : \mathbb{R}^m \to \mathbb{R}$ satisfying $\mathbb{E}\, a(X_1) < \infty$ such that for every $\vartheta_1, \vartheta_2 \in U$ it holds that
$\|\nabla_\vartheta g(x, \vartheta_1) - \nabla_\vartheta g(x, \vartheta_2)\| \le a(x) \|\vartheta_1 - \vartheta_2\|$.

iii) $\mathbb{E}\, g(X_1, \vartheta) = 0$ if and only if $\vartheta = \vartheta_0$.

iv) $\mathbb{E} \|g(X_1, \vartheta)\|^2 < \infty$ for all $\vartheta \in \Theta$, $\Omega_0$ is a positive definite $q \times q$ matrix and $G_0$ is a $q \times r$ matrix of rank
$r$.

v) $W_N$ are $q \times q$ matrices converging in probability to a positive definite matrix $W$.

vi) There exists a function $a : \mathbb{R}^m \to \mathbb{R}$ satisfying $\mathbb{E}\, a(X_1) < \infty$ such that $\|g(x, \vartheta) g(x, \vartheta)^T\| \le a(x)$ and
$\|\nabla_\vartheta g(x, \vartheta)\| \le a(x)$.

It then holds that $\hat\vartheta^N$ is consistent and asymptotically normally distributed, that is

$$N^{1/2} (\hat\vartheta^N - \vartheta_0) \xrightarrow{d} \mathcal{N}(0_r, \Sigma), \quad N \to \infty, \tag{6.4}$$

where the asymptotic covariance matrix $\Sigma$ is given by

$$\Sigma = [G_0^T W G_0]^{-1} G_0^T W \Omega_0 W G_0 [G_0^T W G_0]^{-1}. \tag{6.5}$$

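A result analogous to Theorem 6.1 holds in the more general situation, where we do not have access
to the sample $X^N$ but only to a noisy variant. We first introduce the necessary notation, which we will
need in the proof. The generalized method of moments estimator $\hat\vartheta^{N,h}$ of $\vartheta_0$ based on the disturbed sample
$X^{N,h} = (X_1 + \varepsilon_1^{(h)}, \ldots, X_N + \varepsilon_N^{(h)})$ is defined as

$$\hat\vartheta^{N,h} = \operatorname{argmin}_{\vartheta \in \Theta} Q_{N,h}(\vartheta), \tag{6.6}$$

where the (random) criterion function $Q_{N,h} : \Theta \to \mathbb{R}^+$ has the form

$$Q_{N,h}(\vartheta) = \|m_{N,h}(\vartheta)\|_{W_{N,h}}^2$$

and $m_{N,h} : \Theta \to \mathbb{R}^q$ is given as

$$m_{N,h}(\vartheta) = \frac{1}{N} \sum_{n=1}^{N} g(X_n + \varepsilon_n^{(h)}, \vartheta).$$

Again, $W_{N,h}$ is a positive definite $q \times q$ matrix, which might depend on the sample $X^{N,h}$. As before we write
$\Omega(\vartheta) = \mathbb{E}\, g(X_1, \vartheta)\, g(X_1, \vartheta)^T$ and

$$\Omega_{N,h}(\vartheta) = \frac{1}{N} \sum_{n=1}^{N} g(X_n + \varepsilon_n^{(h)}, \vartheta)\, g(X_n + \varepsilon_n^{(h)}, \vartheta)^T$$

for the covariance matrix of the moments $g$ and its empirical counterpart. The sample analogue of the score
matrix $G(\vartheta) = -\mathbb{E}\, \nabla_\vartheta g(X_1, \vartheta)$ is defined as

$$G_{N,h}(\vartheta) = -\frac{1}{N} \sum_{n=1}^{N} \nabla_\vartheta g(X_n + \varepsilon_n^{(h)}, \vartheta).$$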

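Theorem 6.2. Assume that $(P_\vartheta)_{\vartheta \in \Theta}$ is a parametric family of probability distributions, that $X^N$ is an i.i.d.
sample from the distribution $P_{\vartheta_0}$ of length $N$ and that, for each $h > 0$, there is a stochastic process
$\varepsilon^{(h)} = (\varepsilon_n^{(h)})_{n \in \mathbb{N}}$. Denote by $\hat\vartheta^{N,h}$ the GMM estimator based on $X^{N,h}$ defined in Eq. (6.6). In addition to
the assumptions of Theorem 6.1 assume:

vii) There exists a function $\beta : \mathbb{R}^+ \to \mathbb{R}^+$ satisfying $\beta(h) \to 0$ as $h \to 0$, such that

$$\sup_n \mathbb{E} \big\| g(X_n + \varepsilon_n^{(h)}, \vartheta_0) - g(X_n, \vartheta_0) \big\| = O(\beta(h)), \quad \text{as } h \to 0. \tag{6.8}$$

viii) For all $\vartheta \in \Theta$ it holds that $\sup_n \mathbb{E} \big\| g(X_n + \varepsilon_n^{(h)}, \vartheta) - g(X_n, \vartheta) \big\|^2 \to 0$, as $h \to 0$.

ix) For all $\vartheta \in \Theta$, the derivative of $g$ satisfies $\sup_n \mathbb{E} \big\| \nabla_\vartheta g(X_n + \varepsilon_n^{(h)}, \vartheta) - \nabla_\vartheta g(X_n, \vartheta) \big\| \to 0$, as $h \to 0$.

If $h = h_N$ is chosen dependent on $N$ such that $N^{1/2} \beta(h_N) \to 0$ as $N \to \infty$, then it holds that $\hat\vartheta^{N,h_N}$ is
consistent and asymptotically normal as $N \to \infty$ with the same asymptotic covariance as $\hat\vartheta^N$, given in
Eq. (6.5).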

The proof of Theorem 6.2 closely follows the arguments in [30]. We give a detailed proof in order to
clarify the impact of the additional parameter $h$ and the difficulties arising from the need to take the double
limit $N \to \infty$ and $h \to 0$.

Proof of Theorem 6.2. The proof consists of four steps. In step 1 we show that $N^{1/2} m_{N,h}(\vartheta_0)$ is asymptotically
normally distributed with mean zero and covariance matrix $\Omega_0$, that $m_{N,h}(\vartheta)$, $G_{N,h}(\vartheta)$ and $\Omega_{N,h}(\vartheta)$
converge uniformly in probability to $\mathbb{E}\, g(X_1, \vartheta)$, $G(\vartheta)$ and $\Omega(\vartheta)$, respectively, and that $N Q_{N,h}(\vartheta_0)$ is bounded in
probability. The second step consists in showing that any estimator $\hat\vartheta^{N,h}$ that approximately minimizes the
criterion function $Q_{N,h}$, in the sense that $Q_{N,h}(\hat\vartheta^{N,h})$ converges to zero in probability, converges in probability
to $\vartheta_0$. In step 3 we prove that stochastic boundedness of $N Q_{N,h}(\hat\vartheta^{N,h})$ implies the stochastic boundedness
of $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0)$. We will see that steps 2 and 3 imply the consistency of $\hat\vartheta^{N,h}$ for any sequence of
weighting matrices $W_{N,h}$. In the last step the mean-value theorem is applied to the first-order condition for
$\hat\vartheta^{N,h}$ to prove the asymptotic normality of $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0)$.

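Step 1. In order to prove that $N^{1/2} m_{N,h}(\vartheta_0)$ is asymptotically normally distributed we observe that

$$N^{1/2} m_{N,h}(\vartheta_0) = N^{-1/2} \sum_{n=1}^{N} g(X_n, \vartheta_0) + N^{-1/2} \sum_{n=1}^{N} \big[ g(X_n + \varepsilon_n^{(h)}, \vartheta_0) - g(X_n, \vartheta_0) \big].$$

The first term in this expression is asymptotically normal by the Lindeberg-Lévy Central Limit Theorem
([23, Theorem 15.37]) since the summands $g(X_n, \vartheta_0)$ are i.i.d. with finite variance. It therefore suffices to
show that the second term converges to zero in probability as $N \to \infty$ if $h = h_N$ satisfies $N^{1/2} \beta(h_N) \to 0$.
For convenience we introduce the notation $Y_n^{(h)} = g(X_n + \varepsilon_n^{(h)}, \vartheta_0) - g(X_n, \vartheta_0)$. By the linearity of expectation
and assumption vii) it follows that

$$\mathbb{E} \Big\| N^{-1/2} \sum_{n=1}^{N} Y_n^{(h)} \Big\| \le N^{-1/2} \sum_{n=1}^{N} \mathbb{E} \big\| Y_n^{(h)} \big\| \le C N^{1/2} \beta(h), \quad \text{for some } C > 0. \tag{6.9}$$

This proves that $N^{-1/2} \sum_{n=1}^{N} Y_n^{(h)}$ converges in $L^1$, and hence in probability, to zero, thereby showing the
asymptotic normality of $N^{1/2} m_{N,h}(\vartheta_0)$, that is

$$\Omega_0^{-1/2} N^{1/2} m_{N,h}(\vartheta_0) =: U_{N,h} \xrightarrow{d} U \sim \mathcal{N}(0_q, \mathbf{1}_q), \quad \text{as } N \to \infty,\ h \to 0,\ N^{1/2} \beta(h) \to 0. \tag{6.10}$$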

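We now turn to the uniform convergence in probability of $m_{N,h}(\vartheta)$, $G_{N,h}(\vartheta)$ and $\Omega_{N,h}(\vartheta)$. Pointwise
convergence of $m_{N,h}(\vartheta)$ to $\mathbb{E}\, g(X_1, \vartheta)$ follows from the observation that

$$m_{N,h}(\vartheta) = \frac{1}{N} \sum_{n=1}^{N} g(X_n, \vartheta) + \frac{1}{N} \sum_{n=1}^{N} \big[ g(X_n + \varepsilon_n^{(h)}, \vartheta) - g(X_n, \vartheta) \big].$$

As a sample average the first term converges to $\mathbb{E}\, g(X_1, \vartheta)$ as $N \to \infty$ by the law of large numbers ([23,
Theorem 5.16]). As in Eq. (6.9) one sees that the second term converges in $L^1$ and therefore in probability
to zero as $N \to \infty$ and $h \to 0$. Analogously,

$$G_{N,h}(\vartheta) = -\frac{1}{N} \sum_{n=1}^{N} \nabla_\vartheta g(X_n, \vartheta) - \frac{1}{N} \sum_{n=1}^{N} \big[ \nabla_\vartheta g(X_n + \varepsilon_n^{(h)}, \vartheta) - \nabla_\vartheta g(X_n, \vartheta) \big]$$

converges pointwise in probability to $G(\vartheta) = -\mathbb{E}\, \nabla_\vartheta g(X_1, \vartheta)$ by assumption ix). Finally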

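$$\Omega_{N,h}(\vartheta) = \frac{1}{N} \sum_{n=1}^{N} g(X_n, \vartheta)\, g(X_n, \vartheta)^T + \frac{1}{N} \sum_{n=1}^{N} Y_n^{(h)} \big( Y_n^{(h)} \big)^T + \frac{1}{N} \sum_{n=1}^{N} g(X_n, \vartheta) \big( Y_n^{(h)} \big)^T + \frac{1}{N} \sum_{n=1}^{N} Y_n^{(h)} g(X_n, \vartheta)^T,$$

where we have again used the notation $Y_n^{(h)} = g(X_n + \varepsilon_n^{(h)}, \vartheta) - g(X_n, \vartheta)$. The first term in this expression for
$\Omega_{N,h}(\vartheta)$ converges to $\Omega(\vartheta) = \mathbb{E}\, g(X_1, \vartheta)\, g(X_1, \vartheta)^T$ by the law of large numbers, the second term converges
to zero in $L^1$ and in probability due to assumption viii). An application of the Cauchy-Schwarz inequality
to the third term shows that

$$\mathbb{E} \Big\| \frac{1}{N} \sum_{n=1}^{N} g(X_n, \vartheta) \big( Y_n^{(h)} \big)^T \Big\| \le \sup_n \mathbb{E} \big\| g(X_n, \vartheta) \big( Y_n^{(h)} \big)^T \big\| \le \sup_n \mathbb{E} \big[ \|g(X_n, \vartheta)\| \, \|Y_n^{(h)}\| \big] \le \sqrt{\mathbb{E} \|g(X_1, \vartheta)\|^2} \sqrt{\sup_n \mathbb{E} \|Y_n^{(h)}\|^2}.$$

The first factor is finite by assumption iv), the second one converges to zero as $h \to 0$ by assumption viii);
the fourth term is treated in the same way.
By assumptions ii) and vi), the limiting functions $\vartheta \mapsto \mathbb{E}\, g(X_1, \vartheta)$, $\vartheta \mapsto G(\vartheta)$ and $\vartheta \mapsto \Omega(\vartheta)$ are continuous
and dominated, and since the domain $\Theta$ is compact by assumption i) we can apply Lemma A.8 to conclude
that the convergence is uniform in $\vartheta$. Taking into consideration the assumed convergence in probability of
$W_{N,h}$ (assumption v)) as well as Eq. (6.10), Lemma A.7 implies that $N Q_{N,h}(\vartheta_0)$ is bounded in probability.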

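Step 2. In this step the consistency of any estimator $\hat\vartheta^{N,h}$ satisfying $Q_{N,h}(\hat\vartheta^{N,h}) \xrightarrow{p} 0$ is proved. In step
1 we have established the uniform convergence in probability of $m_{N,h}(\vartheta)$ to $\mathbb{E}\, g(X_1, \vartheta)$. Together with
assumption v) this implies that $\sup_{\vartheta \in \Theta} \big| Q_{N,h}(\vartheta) - \|\mathbb{E}\, g(X_1, \vartheta)\|_W^2 \big| \xrightarrow{p} 0$. To establish consistency of $\hat\vartheta^{N,h}$ we
shall show that for any neighbourhood $U$ of $\vartheta_0$ and every $\epsilon > 0$ there exist an $N_\epsilon(U)$ and an $h_\epsilon(U)$ such that
$\mathbb{P}(\hat\vartheta^{N,h} \in U) > 1 - \epsilon$ for all $N > N_\epsilon(U)$, $h < h_\epsilon(U)$. For given $U$ we define $\delta(U) := \inf_{\vartheta \in \Theta \setminus U} \|\mathbb{E}\, g(X_1, \vartheta)\|_W^2$,
which is strictly positive by assumptions i) to iii). Choosing $N_\epsilon(U)$ and $h_\epsilon(U)$ such that

$$\mathbb{P}\Big( Q_{N,h}(\hat\vartheta^{N,h}) < \frac{\delta(U)}{2} \Big) > 1 - \frac{\epsilon}{2} \quad \text{and} \quad \mathbb{P}\Big( \sup_{\vartheta \in \Theta} \big| Q_{N,h}(\vartheta) - \|\mathbb{E}\, g(X_1, \vartheta)\|_W^2 \big| < \frac{\delta(U)}{2} \Big) > 1 - \frac{\epsilon}{2}$$

for all $N > N_\epsilon(U)$ and $h < h_\epsilon(U)$, it follows that

$$\mathbb{P}(\hat\vartheta^{N,h} \in U) \ge \mathbb{P}\big( \|\mathbb{E}\, g(X_1, \hat\vartheta^{N,h})\|_W^2 < \delta(U) \big)$$
$$\ge \mathbb{P}\Big( Q_{N,h}(\hat\vartheta^{N,h}) < \frac{\delta(U)}{2} \ \text{and} \ \sup_{\vartheta \in \Theta} \big| Q_{N,h}(\vartheta) - \|\mathbb{E}\, g(X_1, \vartheta)\|_W^2 \big| < \frac{\delta(U)}{2} \Big)$$
$$> 1 - \epsilon,$$

where in the last line we used the relation $\mathbb{P}(A \cap B) \ge \mathbb{P}(A) + \mathbb{P}(B) - 1$.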

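Step 3. This step is devoted to the implication that if $N Q_{N,h}(\hat\vartheta^{N,h})$ is bounded in probability, then the
sequence $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0)$ is bounded in probability as well. The assumption $N Q_{N,h}(\hat\vartheta^{N,h}) = O_p(1)$ implies
that $Q_{N,h}(\hat\vartheta^{N,h}) \xrightarrow{p} 0$ and therefore, by the previous step, that $\hat\vartheta^{N,h} \xrightarrow{p} \vartheta_0$. By the mean-value theorem there
exist $\vartheta_i^* \in \Theta$ of the form $\vartheta_i^* = \vartheta_0 + c_i(\hat\vartheta^{N,h} - \vartheta_0)$, $0 \le c_i \le 1$, $i = 1, \ldots, q$, such that we can write

$$N^{1/2} m_{N,h}(\hat\vartheta^{N,h}) = \Omega_0^{1/2} U_{N,h} - G_{N,h}(\vartheta^*) N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0), \tag{6.11}$$

where $G_{N,h}(\vartheta^*)$ denotes the matrix whose $i$th row coincides with the $i$th row of $G_{N,h}(\vartheta_i^*)$ and $U_{N,h}$ is defined
in Eq. (6.10). By applying the triangle inequality of the norm to the vector

$$G_{N,h}(\vartheta^*) N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0) = \Omega_0^{1/2} U_{N,h} - N^{1/2} m_{N,h}(\hat\vartheta^{N,h})$$

one obtains that

$$\big\| G_{N,h}(\vartheta^*) N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0) \big\|_{W_{N,h}}^2 \le 2 \big\| \Omega_0^{1/2} U_{N,h} \big\|_{W_{N,h}}^2 + 2 N Q_{N,h}(\hat\vartheta^{N,h}).$$

Since $U_{N,h}$ converges in distribution to a standard normal and $W_{N,h}$ converges in probability, the first term
on the right hand side of the last display converges in distribution by Lemma A.7 and is in particular
bounded in probability. By our hypothesis, $N Q_{N,h}(\hat\vartheta^{N,h})$ is bounded in probability and so it follows that
$\| G_{N,h}(\vartheta^*) N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0) \|_{W_{N,h}}$ is bounded in probability as well. It follows from the uniform convergence
in probability of $G_{N,h}(\vartheta)$ to $G(\vartheta)$, the fact that $\vartheta_i^* \xrightarrow{p} \vartheta_0$ and Lemma A.9 applied to the rows of $G_{N,h}$ that
$G_{N,h}(\vartheta^*)^T W_{N,h} G_{N,h}(\vartheta^*) \xrightarrow{p} G_0^T W G_0$, which in turn implies that $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0)$ is bounded in probability.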

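Step 4. In this last step we prove that the estimator $\hat\vartheta^{N,h} = \operatorname{argmin}_{\vartheta \in \Theta} Q_{N,h}(\vartheta)$ is asymptotically normally
distributed. The definition of $\hat\vartheta^{N,h}$ implies that $Q_{N,h}(\hat\vartheta^{N,h}) \le Q_{N,h}(\vartheta_0)$. We have shown in step 1 that
$N Q_{N,h}(\vartheta_0)$ is bounded in probability and hence so is $N Q_{N,h}(\hat\vartheta^{N,h})$. This implies by steps 2 and 3 that $\hat\vartheta^{N,h}$ is
consistent and that $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0)$ is bounded in probability. Since $\hat\vartheta^{N,h}$ is an extremal point of $Q_{N,h}$ we
obtain by setting the derivative equal to zero that $G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} N^{1/2} m_{N,h}(\hat\vartheta^{N,h}) = 0$. By combining the
Taylor expansion (6.11) with this first-order condition it follows that

$$0 = G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} \Omega_0^{1/2} U_{N,h} - G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} G_{N,h}(\vartheta^*) N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0).$$

As before one sees that $G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} G_{N,h}(\vartheta^*)$ converges in probability to the non-singular limit $G_0^T W G_0$,
which means that

$$N^{1/2} (\hat\vartheta^{N,h} - \vartheta_0) = \big[ G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} G_{N,h}(\vartheta^*) \big]^{-1} G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} \Omega_0^{1/2} U_{N,h}$$

exists with probability approaching one. Since

$$\big[ G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} G_{N,h}(\vartheta^*) \big]^{-1} G_{N,h}(\hat\vartheta^{N,h})^T W_{N,h} \xrightarrow{p} \big[ G_0^T W G_0 \big]^{-1} G_0^T W,$$

it follows from Lemma A.7 that $N^{1/2}(\hat\vartheta^{N,h} - \vartheta_0) \xrightarrow{d} [G_0^T W G_0]^{-1} G_0^T W \Omega_0^{1/2} U$, a normally distributed
random vector with covariance matrix $\Sigma = [G_0^T W G_0]^{-1} G_0^T W \Omega_0 W G_0 [G_0^T W G_0]^{-1}$. If the dimension $r$ of the
parameter space is equal to the dimension $q$ of the moment vector and the matrix $G_0$ is thus square, or if
$W = \Omega_0^{-1}$, it follows that $\Sigma = [G_0^T \Omega_0^{-1} G_0]^{-1}$. □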

Remark 6.3. It seems possible to extend most aspects of the asymptotic theory of the generalized method of 
moments beyond the Central Limit Theorem 6.1 to deal, for example, with non-compact parameter spaces 
and applications to hypothesis testing based on a disturbed sample as in Theorem 6.2. We choose not to 
pursue these possibilities further in the present paper.

In view of Lemma A.9, assumption v) of Theorem 6.2 is satisfied if we choose $W_{N,h} = W_{N,h}(\tilde\vartheta^{N,h})$ where
$\tilde\vartheta^{N,h}$ is a consistent estimator of $\vartheta_0$ and the functions $\vartheta \mapsto W_{N,h}(\vartheta)$ converge uniformly in probability to
a function $\vartheta \mapsto W(\vartheta)$. In this way one can construct a sequence $W_{N,h}$ of weighting matrices converging in probability
to $\Omega_0^{-1}$. For this two-stage GMM estimation procedure one has the following optimality result.

Corollary 6.4. Let $\tilde\vartheta^{N,h}$ be the estimate of $\vartheta_0$ obtained from minimizing the $W$-norm of $m_{N,h}(\vartheta)$ for any fixed
$q \times q$ positive definite matrix $W$ and let $\hat\vartheta^{N,h}$ be the estimate obtained from using the random weighting
matrix

$$W_{N,h} = \Big[ \frac{1}{N} \sum_{n=1}^{N} g(X_n + \varepsilon_n^{(h)}, \tilde\vartheta^{N,h})\, g(X_n + \varepsilon_n^{(h)}, \tilde\vartheta^{N,h})^T \Big]^{-1}. \tag{6.12}$$

Under the conditions of Theorem 6.2, the estimator $\hat\vartheta^{N,h}$ is consistent and asymptotically normally
distributed. In the partial order induced by positive semidefiniteness, the asymptotic covariance matrix of
the limiting normal distribution, $[G_0^T \Omega_0^{-1} G_0]^{-1}$, is smaller than or equal to the covariance matrix obtained
from every other sequence of weighting matrices $W_{N,h}$.

Proof. It has been shown in the proof of Theorem 6.2 that the preliminary estimator $\tilde\vartheta^{N,h}$ is consistent and
that the sequence of functions $\vartheta \mapsto \Omega_{N,h}(\vartheta)$ converges uniformly in probability to the function $\vartheta \mapsto \Omega(\vartheta)$.
It then follows from Lemma A.9 that the sequence $W_{N,h}$ of weighting matrices converges in probability to
$\Omega_0^{-1}$ and from Theorem 6.2 that $\hat\vartheta^{N,h}$ is asymptotically normal with asymptotic covariance matrix

$$[G_0^T \Omega_0^{-1} G_0]^{-1} G_0^T \Omega_0^{-1} \Omega_0 \Omega_0^{-1} G_0 [G_0^T \Omega_0^{-1} G_0]^{-1} = [G_0^T \Omega_0^{-1} G_0]^{-1}.$$

To show that this is smaller than or equal to the asymptotic covariance matrix of an estimator obtained from
using a sequence of weighting matrices that converges in probability to the positive definite matrix $W$ we
must show that the matrix

$$\Delta = [G_0^T W G_0]^{-1} G_0^T W \Omega_0 W G_0 [G_0^T W G_0]^{-1} - [G_0^T \Omega_0^{-1} G_0]^{-1}$$

is positive semidefinite. To see this it is enough to note that $\Delta$ can be written as

$$\Delta = \big[ \Omega_0^{1/2} W G_0 (G_0^T W G_0)^{-1} \big]^T \big[ \mathbf{1}_q - \Omega_0^{-1/2} G_0 (G_0^T \Omega_0^{-1} G_0)^{-1} G_0^T \Omega_0^{-1/2} \big] \big[ \Omega_0^{1/2} W G_0 (G_0^T W G_0)^{-1} \big].$$

Since the factor in the middle is a symmetric idempotent matrix and therefore positive semidefinite, and
semidefiniteness is preserved under conjugation, the matrix $\Delta$ is positive semidefinite. □
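
The optimality statement of Corollary 6.4 can be checked numerically in the simplest overidentified case $r = 1$, $q = 2$, where $G_0$ is a vector and the sandwich formula (6.5) reduces to the scalar $(G_0^T W \Omega_0 W G_0)/(G_0^T W G_0)^2$. The following Python sketch is only an illustration; the numerical values of $G_0$ and $\Omega_0$ and all function names are assumptions chosen for the example.

```python
def mat_vec(a, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def asymptotic_variance(g0, omega0, w):
    """Scalar version of Eq. (6.5) for a symmetric weighting matrix w."""
    wg = mat_vec(w, g0)
    return dot(wg, mat_vec(omega0, wg)) / dot(g0, wg) ** 2

# Hypothetical score vector and moment covariance matrix.
G0 = [1.0, 2.0]
OMEGA0 = [[2.0, 0.5], [0.5, 1.0]]

optimal = asymptotic_variance(G0, OMEGA0, inverse_2x2(OMEGA0))
identity = asymptotic_variance(G0, OMEGA0, [[1.0, 0.0], [0.0, 1.0]])
```

For these values the optimal weighting gives variance $1/(G_0^T \Omega_0^{-1} G_0) = 0.25$, whereas the identity weighting gives $0.32$.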

We can now state and prove our main result about the asymptotic properties of the generalized method
of moments estimation of the driving Lévy process of a multivariate CARMA process from discrete
observations. This method can be used to select a suitable driving process from within a parametric family
of Lévy processes as part of specifying a CARMA model for an observed time series. We assume that
$\Theta$ is a parameter space and that $(L_\vartheta)_{\vartheta \in \Theta}$ is a family of Lévy processes. The process $Y$ is an $L_{\vartheta_0}$-driven
multivariate CARMA(p,q) process given by a state space representation of the form (3.10) and we assume
that $h$-spaced observations $Y(0), Y(h), \ldots, Y(N + (p - q - 1)h)$ of $Y$ are available on the discrete time grid
$(0, h, \ldots, N + (p - q - 1)h)$. Based on these observed values, a set of approximate unit increments $\widehat{\Delta L}_n^{(h)}$,
$n = 1, \ldots, N$, of the driving process is computed using Eq. (5.13). For each integer $N$ and each sampling
frequency $h^{-1} \in \mathbb{N}$, a generalized method of moments estimator is defined as in Eq. (6.6) by

$$\hat\vartheta^{N,h} = \operatorname{argmin}_{\vartheta \in \Theta} \Big\| \frac{1}{N} \sum_{n=1}^{N} g\big( \widehat{\Delta L}_n^{(h)}, \vartheta \big) \Big\|_{W_{N,h}}, \tag{6.13}$$

where $g : \mathbb{R}^m \times \Theta \to \mathbb{R}^q$ is a moment function and $W_{N,h} \in M_q(\mathbb{R})$ is a positive definite weighting matrix.
The following theorem asserts that the sequence $(\hat\vartheta^{N,h_N})_N$ of estimators is consistent and asymptotically
normally distributed if $h_N$ is chosen such that $N h_N$ converges to zero.

Theorem 6.5. Assume that $\Theta \subset \mathbb{R}^r$ is a parameter space, that $(L_\vartheta)_{\vartheta \in \Theta}$ is a family of $m$-dimensional Lévy
processes and that $Y$ is an $L_{\vartheta_0}$-driven multivariate CARMA process satisfying Assumptions A1 and A2.
Denote by $\hat\vartheta^{N,h}$ the generalized method of moments estimator defined in Eq. (6.13). Assume that, for some
integer $k$, the functions $f_\vartheta : x \mapsto g(x, \vartheta)$ possess a bounded $k$th derivative, that $\mathbb{E} \|L_{\vartheta_0}(1)\|^{2k}$ is finite and
that the partial derivatives of the functions $f_\vartheta$ satisfy

$$\mathbb{E} \big\| \partial_{i_1} \cdots \partial_{i_\kappa} f_\vartheta(L_{\vartheta_0}(1)) \big\|^2 < \infty, \quad 1 \le i_j \le m, \quad 1 \le \kappa \le k - 1, \quad \vartheta \in \Theta. \tag{6.14}$$

Further assume that, for each $x \in \mathbb{R}^m$, the function $\vartheta \mapsto g(x, \vartheta)$ is differentiable, that, for some integer $l$,
the functions $h_\vartheta : x \mapsto \nabla_\vartheta g(x, \vartheta)$ have a bounded $l$th derivative and that the partial derivatives of $h_\vartheta$ satisfy

$$\mathbb{E} \big\| \partial_{i_1} \cdots \partial_{i_\lambda} h_\vartheta(L_{\vartheta_0}(1)) \big\|^2 < \infty, \quad 1 \le i_j \le m, \quad 1 \le \lambda \le l - 1, \quad \vartheta \in \Theta. \tag{6.15}$$

If, in addition, assumptions i) to vi) of Theorem 6.1 are satisfied with $X_1$ replaced by $L_{\vartheta_0}(1)$, and if $h = h_N$
is chosen dependent on $N$ such that $N h_N$ converges to zero as $N$ tends to infinity, then the estimator $\hat\vartheta^{N,h_N}$ is
consistent and asymptotically normally distributed with asymptotic covariance matrix given in Eq. (6.5).

Proof. It suffices to check conditions vii) to ix) of Theorem 6.2. All three conditions follow by assumptions
(6.14) and (6.15) from Lemma 5.8, which also shows that the function $\beta$ in vii) can be taken as $\beta : h \mapsto h^{1/2}$.
Consequently, the assumption $N^{1/2} \beta(h_N) \to 0$ from Theorem 6.2 simplifies to the requirement that $N h_N$
converges to zero and the result follows. □

Remark 6.6. If we introduce the notation $n = N/h$ for the total number of observations of the MCARMA
process, the high-frequency condition from Theorem 6.5 becomes $n h^2 \to 0$, which is the rate commonly
encountered in the literature when dealing with the estimation of continuous-time processes.

7. Simulation study 

In this section we illustrate the estimation procedure developed in this paper using the example of a 
univariate CARMA(3,1) process Y driven by a Gamma process. A similar example was considered in [11] 
as a model for the realized volatility of DM/$ exchange rates. Gamma processes are a family of univariate 
infinite activity pure-jump Lévy subordinators $(\Gamma_{b,a}(t))_{t \in \mathbb{R}}$, which are parametrized by two positive real
numbers $a$ and $b$, see, e.g., [2, Example 1.3.22]. Their moment generating function is given by

$$\mathbb{E}\, e^{u \Gamma_{b,a}(t)} = (1 - b u)^{-a t}, \quad a, b > 0;$$

the unit increments $\Gamma_{b,a}(n) - \Gamma_{b,a}(n - 1)$ follow a Gamma distribution with scale parameter $b$ and shape
parameter $a$. This distribution has density

$$f_{b,a}(x) = \frac{x^{a-1} e^{-x/b}}{\Gamma(a)\, b^a}, \quad x > 0,$$

mean $ab$ and cumulative distribution function

$$F_{b,a}(x) = \int_0^x f_{b,a}(\xi)\, d\xi = \frac{\gamma(a; x/b)}{\Gamma(a)}, \tag{7.1}$$

where $\Gamma(\cdot)$ and $\gamma(\cdot\,; \cdot)$ denote the complete and the lower incomplete gamma function, respectively.
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In contrast to the example studied in [11] we chose to simulate a model of order (3, 1) in order to 
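demonstrate the feasibility of approximating the derivatives $D^\nu Y$ which appear in Eq. (5.13). The dynamics
of the CARMA process used in the simulations are determined by the polynomials

$$P(z) = z^3 + 2 z^2 + \frac{3}{2} z + \frac{1}{2} \quad \text{and} \quad Q(z) = 1 + z,$$

corresponding to autoregressive roots $\lambda_1 = -1$ and $\lambda_{2,3} = -\frac{1}{2} \pm \frac{i}{2}$. The process $Y$ is simulated by applying
an Euler scheme with step width $5 \times 10^{-4}$ to the state space model (cf. Theorem 3.2)

$$dX(t) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -\frac{1}{2} & -\frac{3}{2} & -2 \end{bmatrix} X(t)\, dt + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} d\Gamma_{2,1}(t), \qquad Y(t) = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} X(t). \tag{7.2}$$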


The initial value $X(0)$ is set to zero. Another possibility would be to sample $X(0)$ from the marginal
distribution of the stationary solution of Eq. (7.2), but since the effect of the choice of $X(0)$ decays at an
exponential rate this does not make a substantial difference. A typical realization of the resulting CARMA
process $Y$ on the time interval [0, 200] is depicted in Fig. 1b. In the case of finite variation Lévy processes
there is a pathwise correspondence between a CARMA process and the driving Lévy process. Since this
applies in particular to Gamma processes, it is possible to show in Fig. 1a the path of the driving process
which generated the shown realization of $Y$. Such a juxtaposition is useful in that it allows one to see how big
jumps in the driving process can cause spikes in the resulting CARMA process.
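
The Euler scheme for a model of the form (7.2) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions and not the code used for the simulations reported here: the function name, the seed handling and the coarse default step are hypothetical, the output row is taken to be $[1\ 1\ 0]$, and the increment of the Gamma process over a step of length $\delta$ is drawn as a Gamma variable with shape $a\delta$ and scale $b$.

```python
import random

def simulate_carma31(horizon, step, b=2.0, a=1.0, seed=0):
    """Euler scheme for dX = A X dt + e_3 dGamma_{b,a}, Y = X_1 + X_2, X(0) = 0."""
    rng = random.Random(seed)
    x = [0.0, 0.0, 0.0]
    path = [0.0]
    for _ in range(int(round(horizon / step))):
        # Increment of the driving Gamma process over one step:
        # shape a * step, scale b (random.gammavariate takes shape, then scale).
        jump = rng.gammavariate(a * step, b)
        drift = -0.5 * x[0] - 1.5 * x[1] - 2.0 * x[2]
        # The right-hand side uses the old state throughout (explicit Euler).
        x = [x[0] + x[1] * step,
             x[1] + x[2] * step,
             x[2] + drift * step + jump]
        path.append(x[0] + x[1])
    return path
```

Under these assumptions the stationary mean of the output is $Q(0)/P(0) \cdot ab = 4$ for $(b, a) = (2, 1)$, which gives a quick sanity check on a simulated path once the influence of the zero initial value has died out.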




(a) $\Gamma_{2,1}$-process (b) $\Gamma_{2,1}$-driven CARMA(3,1) process

Figure 1. Typical realization of a $\Gamma_{2,1}$-process and the corresponding CARMA(3,1) process with
dynamics given by Eq. (7.2)



The first step in the implementation of our estimation procedure is to approximate the increments $\Delta\Gamma_n$
of the driving Gamma process from discrete-time observations of the CARMA process $Y$. For the value
$h = 0.01$ of the sampling interval Fig. 2 compares the true increments with the approximations $\widehat{\Delta\Gamma}_n^{(h)}$
obtained from Eq. (5.13) both directly and in terms of their cumulative distribution functions. We see that the
approximations $\widehat{\Delta\Gamma}_n^{(h)}$ are very good for each individual increment and that therefore the empirical
distribution function of the reconstructed increments closely follows the CDF (7.1) of the gamma distribution even
if the observation period is rather short.

In the next step we used the approximate increments $\widehat{\Delta\Gamma}_n^{(h)}$ and a standard numerical optimization routine
to compute the maximum likelihood estimator

$$\big( \hat b^{N,(h)}, \hat a^{N,(h)} \big) = \operatorname{argmax}_{(b,a) \in \mathbb{R}^+ \times \mathbb{R}^+} \prod_{n=1}^{N} f_{b,a}\big( \widehat{\Delta\Gamma}_n^{(h)} \big), \tag{7.3}$$

or, equivalently,

$$\big( \hat b^{N,(h)}, \hat a^{N,(h)} \big) = \operatorname{argmin}_{(b,a) \in \mathbb{R}^+ \times \mathbb{R}^+} \Big\| \sum_{n=1}^{N} \nabla_{(b,a)} \log f_{b,a}\big( \widehat{\Delta\Gamma}_n^{(h)} \big) \Big\|.$$

(a) Bar chart of the increments of the driving $\Gamma_{2,1}$-process. White bars represent the true increments, black
bars indicate the values of the estimates.

(b) Cumulative distribution function of the increments of the driving $\Gamma_{2,1}$-process. The dashed line shows
the true CDF given by Eq. (7.1), the solid line represents the empirical distribution function of the estimates.

Figure 2. Comparison of the true increments of a Gamma process with parameters $(b, a) = (2, 1)$
to the estimates of the increments computed via Eq. (5.13) from discrete observations of the $\Gamma_{2,1}$-driven
CARMA(3,1) process defined by Eq. (7.2) on the time grid (0, 0.01, 0.02, . . . , 30).



In this form, the maximum likelihood estimator falls into the class of generalized moments estimators.
From the explicit form of the function $g = \nabla_{(b,a)} \log f_{b,a}$ it is easy to check that the assumptions of
Theorem 6.5 are satisfied. Since in the present case, and for maximum likelihood estimators in general, the
dimension of the moment vector is equal to the dimension of the parameter space, the choice of the
weighting matrices $W_{N,h}$ is irrelevant and the estimator is always best in the sense of Corollary 6.4.

With the goal of confirming the assertions of Theorem 6.5 we first focused on consistency and
investigated the effect of finite sampling frequencies. Figure 3 visualizes the empirical means and marginal
standard deviations of the maximum likelihood estimator (7.3) obtained from 500 independent realizations
of the CARMA process $Y$ from Eq. (7.2) simulated over the time horizon [0, 200] and sampled at instants
$(0, h, 2h, \ldots, N)$ for different values of $h$. The picture suggests that the estimator $(\hat b^{N,(h)}, \hat a^{N,(h)})$ is biased
for positive values of $h$, even as $N$ tends to infinity, but that it is consistent as $h$ tends to zero. This is in
agreement with Theorem 6.5 and reflects the intuition that discrete sampling entails a loss of information
compared with a genuinely continuous-time observation of a stochastic process.

Finally, we conducted another Monte Carlo simulation with the goal of confirming the asymptotic
normality of the maximum likelihood estimator (7.3). Figure 4 compares the empirical distribution of the
estimator $(\hat b^{200,(0.001)}, \hat a^{200,(0.001)})$ to the asymptotic normal distribution asserted by the Central Limit
Theorem 6.5. The points indicate the values of the estimates obtained from 500 independent realizations of the
CARMA process (7.2). The dashed and solid straight lines show the empirical mean (1.9772, 1.0217) of
the estimates and the true values (2, 1) of the parameter $(b, a)$, respectively, which are in good agreement.

The dashed and solid ellipses represent the empirical autocovariance matrix | ^2^45 0^78^ ) ^ 

the estimates and the scaled asymptotic covariance matrix S/200 ~ | ^1^55 0^78^ j>< 10"^, respectively. 

Their closeness, which is also reflected by the similarity of the ellipses in Fig. 4, means that, even for finite
observation periods and sampling frequencies, $\Sigma/N$ is a good approximation of the true covariance of the
estimator $(\hat b^{N,(h)}, \hat a^{N,(h)})$ and can thus be used for the construction of confidence regions. For the present
example, the asymptotic covariance matrix $\Sigma$, given by Eq. (6.5), can be computed explicitly as

$$\Sigma^{-1} = -\mathbb{E} \big[ \nabla^2_{(b,a)} \log f_{b,a}(\Gamma_{b,a}(1)) \big] \Big|_{(b,a)=(2,1)} = \begin{pmatrix} a/b^2 & 1/b \\ 1/b & \psi_1(a) \end{pmatrix} \Bigg|_{(b,a)=(2,1)} = \begin{pmatrix} 1/4 & 1/2 \\ 1/2 & \pi^2/6 \end{pmatrix},$$

where $\psi_1$ denotes the trigamma function, that is the second derivative of the logarithm of the gamma
function. Figure 4 also compares histograms of $\hat b^{200,(0.001)}$ and $\hat a^{200,(0.001)}$ to the densities of the marginals of
the bivariate Gaussian distribution with mean (2, 1) and covariance matrix $\Sigma/200$. The agreement is very
good, in accordance with the Central Limit Theorem 6.5.

Figure 3. Empirical means (×) and standard deviations of the estimators $(\hat b^{200,(h)}, \hat a^{200,(h)})$ based on
500 independent observations of the MCARMA process (7.2) on the time grid $(0, h, 2h, \ldots, 200)$
for $h \in \{0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005\}$. The dashed lines indicate the true parameter
value $(b, a) = (2, 1)$.
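
The explicit covariance matrix can be reproduced with a short Python sketch that inverts the Fisher information matrix of the Gamma family with scale $b$ and shape $a$. The function names and the particular numerical approximation of the trigamma function (recurrence plus asymptotic expansion) are choices made for this illustration only.

```python
def trigamma(x):
    """Trigamma function for x > 0 via recurrence and an asymptotic expansion."""
    value = 0.0
    while x < 6.0:
        # psi_1(x) = psi_1(x + 1) + 1 / x^2
        value += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi_1(x) ~ 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    return value + inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))

def gamma_fisher_information(b, a):
    """Fisher information of the Gamma law in the parameter order (b, a)."""
    return [[a / b ** 2, 1.0 / b], [1.0 / b, trigamma(a)]]

def asymptotic_covariance(b, a):
    """Inverse of the 2x2 Fisher information matrix."""
    info = gamma_fisher_information(b, a)
    det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
    return [[info[1][1] / det, -info[0][1] / det],
            [-info[1][0] / det, info[0][0] / det]]

SIGMA = asymptotic_covariance(2.0, 1.0)
SCALED = [[entry / 200.0 for entry in row] for row in SIGMA]
```

At $(b, a) = (2, 1)$ the scaled matrix $\Sigma/200$ has entries of roughly $5.10 \times 10^{-2}$, $-1.55 \times 10^{-2}$ and $0.78 \times 10^{-2}$.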
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Appendix A. Auxiliary results 

In this appendix we collect some auxiliary results and technical proofs to complement the derivation of 
the results presented in the main part of the paper. 

A.1. Auxiliary results for Section 3.

Lemma A.1. Assume that $L$ is a Lévy process and that $Y$ is an $L$-driven multivariate CARMA process
given by the state space representation (3.10) and satisfying Assumption A1. Then the following hold.

(i) The process $Y$ is strictly stationary.

(ii) The paths of $Y$ are $p - q - 1$ times differentiable. Moreover, for $j = 1, \ldots, p$, the paths of the $j$th
$m$-block of the state process $X$ are $p - j$ times differentiable.

(iii) For any $k > 0$ and any $t, s \in \mathbb{R}$, finiteness of $\mathbb{E} \|L(1)\|^k$ implies finiteness of both $\mathbb{E} \|X(t)\|^k$ and
$\mathbb{E} \|Y(s)\|^k$. Conversely, finiteness of the $k$th moment of $X(t)$ implies finiteness of $\mathbb{E} \|L(1)\|^k$.

(iv) If $\mathbb{E} \|L(1)\|^k$ is finite for some $k > 0$, then the process $Y$ is strongly mixing with exponentially decaying
mixing coefficients.

Figure 4. Comparison of the empirical distribution of the estimator $(\hat b^{200,(0.001)}, \hat a^{200,(0.001)})$ based
on 500 realizations of the $\Gamma_{2,1}$-driven CARMA(3,1) process given by Eq. (7.2) to the asymptotic
distribution implied by the Central Limit Theorem 6.5

Proof. The first claim is an immediate consequence of the state space representation (3.10). Parts ii) and iii) 
follow from [27, Propositions 3.32 and 3.30], respectively, if we observe that Ep is injective. The assertion 
iv) follows from [28, Theorem 4.3], see also the proof of [27, Proposition 3.34]. □ 

The following lemma relates strong mixing of a continuous-time process to strong mixing of functionals 
of the process. It is used in the proof of Proposition 5.3. 

Lemma A.2. Let $X = (X_t)_{t \in \mathbb{R}}$ be an $\mathbb{R}^d$-valued (exponentially) strongly mixing stochastic process. If for
each $n \in \mathbb{Z}$, the random variable $Y_n$ is measurable with respect to $\sigma(X_t : n - 1 \le t \le n)$ then the stochastic
process $(Y_n)_{n \in \mathbb{Z}}$ is (exponentially) strongly mixing. In particular, if $f : (\mathbb{R}^d)^{[0,1]} \to \mathbb{R}^m$ is a measurable
function, then the $\mathbb{R}^m$-valued stochastic process $(f((X_{n-1+t})_{t \in [0,1]}))_{n \in \mathbb{Z}}$ is (exponentially) strongly mixing.

Proof. This follows immediately from Eq. (3.15), the definition of the strong mixing coefficients. □ 

A.2. Auxiliary results for Section 5. The following lemma collects some useful properties of forward
differences; in particular it shows that if a function $f$ is sufficiently smooth, then the derivative $D^\nu f(t)$ is
well approximated by $\Delta_h^\nu[f](t)$.

Lemma A.3. For $h > 0$ and a positive integer $\nu$ let the forward differences $\Delta_h^\nu[f](t)$ be defined by Eq. (5.1).
The following properties hold:

i) For every positive integer $k < \nu$ and every function $f$, one has $\Delta_h^\nu[f] = \Delta_h^k\big[\Delta_h^{\nu-k}[f](\cdot)\big]$.

ii) If the function $f : \mathbb{R} \to \mathbb{R}^m$ is $\nu + 1$ times continuously differentiable on the interval $[t, t + \nu h]$ then
there exist $t_i^* \in [t, t + \nu h]$, $i = 1, \ldots, m$, such that

Kim) = D7(f) - ^D''+7(f*), (A.l) 

where $D^{\nu+1} f(t^*)$ is the vector whose $i$th component equals the $i$th component of $D^{\nu+1} f(t_i^*)$. In particular,
for every polynomial $p$ of degree at most $\nu$, one has $\Delta_h^\nu[p] = D^\nu p$.

iii) If the $(\nu + 1)$th derivative of $f$ is not assumed to be continuous it holds that
|K[/](f)-D7(f)|| </2 sup IId^+'/WII- (A.2) 

se[t,t+vh] 
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Proof. Property i) is immediate from the definition (5.1). The assertions of ii) and iii) follow from a 
component-wise application of Taylor's theorem ([1, Theorem 5.19]). □ 
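
The polynomial exactness stated in Lemma A.3 is easy to check numerically. The following Python sketch assumes that Eq. (5.1) is the usual normalized forward difference $h^{-\nu} \sum_{k=0}^{\nu} (-1)^{\nu-k} \binom{\nu}{k} f(t + kh)$; this form and the function name are assumptions made here for illustration and should be compared with the definition in Section 5.

```python
from math import comb

def forward_difference(f, t, h, order):
    """Normalized forward difference of the given order with step h (assumed form of Eq. (5.1))."""
    total = sum((-1) ** (order - k) * comb(order, k) * f(t + k * h)
                for k in range(order + 1))
    return total / h ** order

def cubic(t):
    """Test polynomial of degree 3 whose third derivative is identically 6."""
    return t ** 3 - 2.0 * t
```

For the cubic above, `forward_difference(cubic, 0.7, 0.1, 3)` reproduces the third derivative up to rounding error, while for a non-polynomial function the first-order difference deviates from the derivative by a term of order $h$.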

In the next lemma we will show that the supremum of an Ornstein-Uhlenbeck-type process has finite
absolute $k$th moments if and only if the driving Lévy process has finite $k$th moments. This will allow us to
effectively employ the error bound (A.2) for multivariate CARMA processes.

Lemma A.4. Let $(L(t))_{t \ge 0}$ be an $m$-variate Lévy process and let $A \in M_N(\mathbb{R})$, $B \in M_{N,m}(\mathbb{R})$ be given
coefficient matrices. Assume that all eigenvalues of $A$ have strictly negative real parts and that
$X = (X(t))_{t \ge 0}$ is the unique stationary solution of the stochastic differential equation

$$dX(t) = A X(t)\, dt + B\, dL(t). \tag{A.3}$$

Further denote by

$$X^*(t) = \sup_{0 \le s \le t} \|X(s)\| \tag{A.4}$$

the supremum of $\|X\|$ on the compact interval $[0, t]$. It then holds that, for every $t \in \mathbb{R}$ and every $k > 0$, the
$k$th moment $\mathbb{E} (X^*(t))^k$ is finite if and only if $\mathbb{E} \|L(1)\|^k$ is finite.

Proof. If $\mathbb{E} \|L(1)\|^k$ is infinite it follows from Lemma A.1,iii) that $\mathbb{E} \|X(t)\|^k$ is infinite as well for every
$t \in \mathbb{R}$ and that therefore $\mathbb{E} (X^*(t))^k$ must be infinite. The other implication requires more work.

We first note that $X^*(t) \le \sum_{i=1}^{N} X_i^*(t)$ where $X_i^*(t) = \sup_{0 \le s \le t} |X^i(s)|$ is the supremum of the $i$th component
of $X$ over the interval $[0, t]$. Since each $X^i$ is a semimartingale, [33, Theorem V.2] shows that there exists
a universal constant $c_k$ such that $\mathbb{E} (X_i^*(t))^k \le c_k \|X^i\|_{H_t^k}^k$, where the norm $\|\cdot\|_{H_t^k}$ is defined by

$$\|X^i\|_{H_t^k}^k = \inf_{X^i = M_i + V_i} \mathbb{E} \Big( \int_0^t |dV_i(s)| + [M_i, M_i]_t^{1/2} \Big)^k.$$

Here, the infimum is taken over all decompositions of $X^i$ into a local martingale $M_i$ and an adapted, càdlàg
process $V_i$ with finite variation, and $[\cdot, \cdot]$ denotes the quadratic variation process. In our situation, Eq. (A.3)
defines a canonical decomposition of $X^i$, $i = 1, \ldots, N$, into the finite variation process $V_i = (V_i(t))_{t \ge 0}$ given
by

$$V_i(t) = e_i^T \Big[ X(0) + \int_0^t A X(s)\, ds + t B\, \mathbb{E} L(1) \Big],$$

where $e_i$ denotes the $i$th unit vector in $\mathbb{R}^N$, and the martingale $M_i = (M_i(t))_{t \ge 0}$ given by

$$M_i(t) = e_i^T B \big[ L(t) - t\, \mathbb{E} L(1) \big].$$

Since clearly,

$$(X^*(t))^k = \sup_{0 \le s \le t} \big( X^1(s)^2 + \ldots + X^N(s)^2 \big)^{k/2} \le \big( X_1^*(t)^2 + \ldots + X_N^*(t)^2 \big)^{k/2} \le N^{k/2} \max_{1 \le i \le N} X_i^*(t)^k \le N^{k/2} \sum_{i=1}^{N} X_i^*(t)^k,$$

it suffices to bound the $k$th moments of $X_i^*(t)$ in order to obtain a bound for the $k$th moment of $X^*(t)$. The
former can be estimated as



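\[ \mathbb{E}\,(X_i^*(t))^k \le c_k \|X^i\|^k_{\mathcal{H}^k_t} \le 2^k c_k \left[ \mathbb{E} \left( \int_0^t |dV_i(s)| \right)^k + \mathbb{E}\,[M_i, M_i]_t^{k/2} \right]. \tag{A.5} \]

The first term in this expression is seen to satisfy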



\[ \mathbb{E} \left( \int_0^t |dV_i(s)| \right)^k \le \mathbb{E} \left( \int_0^t |e_i^T AX(s)|\,ds + t\,|e_i^T B\,\mathbb{E}L(1)| \right)^k \le 2^k \left[ t^{k-1} \|A\|^k \int_0^t \mathbb{E}\,\|X(s)\|^k\,ds + t^k \|B\|^k\, \mathbb{E}\,\|L(1)\|^k \right] < \infty, \]

where the finiteness of the integral $\int_0^t \mathbb{E}\,\|X(s)\|^k\,ds$ follows from the assumption that $\mathbb{E}\,\|L(1)\|^k$ is finite and
the strict stationarity of $X$. For the second term in Eq. (A.5) one obtains the bound

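\[ \mathbb{E}\,[M_i, M_i]_t^{k/2} = \mathbb{E} \left( e_i^T B\,[L, L]_t\,B^T e_i \right)^{k/2} \le \|B\|^k\, \mathbb{E}\,\|[L, L]_t\|^{k/2} \le 2^k \|B\|^k \left[ \|\Sigma^L\|^{k/2} t^{k/2} + \mathbb{E} \left\| \int_0^t \int_{\mathbb{R}^m} x x^T N(ds, dx) \right\|^{k/2} \right], \]

where we have used [20, Theorem I.4.52] to compute the quadratic variation of the Lévy process $L$ with
characteristic triplet $(\gamma^L, \Sigma^L, \nu^L)$. To see that this expression is finite we observe that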



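\[ \mathbb{E} \left\| \int_0^t \int_{\mathbb{R}^m} x x^T N(ds, dx) \right\|^{k/2} \le m^{k/2}\, \mathbb{E} \left( \int_0^t \int_{\mathbb{R}^m} \|x\|^2 N(ds, dx) \right)^{k/2} = m^{k/2}\, \mathbb{E} \lim_{\epsilon\to 0} \left( \int_0^t \int_{\|x\|>\epsilon} \|x\|^2 N(ds, dx) \right)^{k/2} \]
\[ = m^{k/2} \lim_{\epsilon\to 0} \mathbb{E} \left( \int_0^t \int_{\|x\|>\epsilon} \|x\|^2 N(ds, dx) \right)^{k/2} =: m^{k/2} \lim_{\epsilon\to 0} \mathbb{E}\,F_\epsilon^{k/2}, \]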

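where we have applied the Monotone Convergence Theorem ([23, Theorem 4.20]) to interchange the order
of expectation and passing to the limit. By [36, Proposition 19.5], for each $\epsilon > 0$, the random variable
$F_\epsilon = \int_0^t \int_{\|x\|>\epsilon} \|x\|^2 N(ds, dx)$ is infinitely divisible with characteristic measure $\rho_\epsilon = \left( \lambda|_{[0,t]} \otimes \nu^L|_{\{\|x\|>\epsilon\}} \right) \circ \varphi_\epsilon^{-1}$,
where $\varphi_\epsilon : [0, t] \times \{\|x\| > \epsilon\} \to \mathbb{R}$ maps $(s, x)$ to $\|x\|^2$, and with characteristic drift $\gamma_\epsilon = \int_0^1 y\,\rho_\epsilon(dy)$. From
this it follows that

\[ \int_0^\infty y^{k/2} \rho_\epsilon(dy) = t \int_{\|x\|>\epsilon} \|x\|^k\, \nu^L(dx) \le t \int_{\epsilon<\|x\|\le 1} \|x\|^k\, \nu^L(dx) + t \int_{\|x\|>1} \|x\|^k\, \nu^L(dx) < \infty, \quad \forall \epsilon > 0, \]

and

\[ \int_0^1 y\,\rho_\epsilon(dy) = t \int_{\epsilon<\|x\|\le 1} \|x\|^2\, \nu^L(dx) \le t \int_{\|x\|\le 1} \|x\|^2\, \nu^L(dx) + t \int_{\|x\|>1} \|x\|^k\, \nu^L(dx) < \infty, \quad \forall \epsilon > 0. \]

Lemma 2.2 then implies that $\lim_{\epsilon\to 0} \mathbb{E}\,F_\epsilon^{k/2}$ is finite, which completes the proof. □

For the upcoming proof of Lemma 5.2 we first show the following locality property of the
approximation errors $\varepsilon^{\nu,(h)}_{L,n}$.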

Lemma A.5. For every positive integer $\nu \ge 2$ and every function $f$, the approximation error $\varepsilon^{\nu,(h)}_{f,n}$ is a
function only of the increments $\{f(t) - f(n) : n \le t \le n + \nu h\}$. This function is independent of $n$. In
particular, $\left(\varepsilon^{\nu,(h)}_{L,n}\right)_n$ is an i.i.d. sequence.

Proof. The claim can be shown by direct calculations; Lemma A.3,i) implies that 

A:|';l(")=A;,Ar'|';l<"> 



^Z<-'>'-'t;')';<--*) 



(n) 



\s)ds 



i=0 
v-l 



" \ ' I Jn+ih LJ(K/,,_]< -</|<.! 



_I ■■■dfi 



ds. 



Using that the set $\{t_{\nu-1} \le t_{\nu-2} \le \cdots \le t_1 \le s\}$ is congruent to the $(\nu - 2)$-dimensional simplex in the
hypercube with side lengths $s - t_{\nu-1}$ and that thus

\[ \int_{t_{\nu-1} \le t_{\nu-2} \le \cdots \le t_1 \le s} dt_{\nu-2} \cdots dt_1 = \frac{1}{(\nu-2)!} (s - t_{\nu-1})^{\nu-2}, \]

we obtain that



rl7(-ir'-f'')r 

h^(y- 2)1 Jo [-J^^ ^ \ / jj„,. 



'-n+(/+l)/i 



r2^z<-')'-'i ; )L, |(-'.-.rv(.-.)d,„,d. 



/(fv-i)dfv 



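It is easy to see that $\int_{n+ih}^{n+(i+1)h} (s - t_{\nu-1})^{\nu-2}\,ds$ is equal to $P_{\nu,h}(n - t_{\nu-1} + ih)$ for some polynomial $P_{\nu,h}$ of degree
$\nu - 2$. It then follows from Lemma A.3,ii) that

\[ \sum_{i=0}^{\nu-1} (-1)^{\nu-1-i} \binom{\nu-1}{i} \int_{n+ih}^{n+(i+1)h} (s - t_{\nu-1})^{\nu-2}\,ds = \Delta_h^{\nu-1}[P_{\nu,h}](n - t_{\nu-1}) = 0, \quad \forall t_{\nu-1} \in [0, n], \]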

which implies that the first term in the last expression for AJJ (n) vanishes. It is similarly easy to see 
that 

1 ^ /v- 1\ r"+*'+'*'' r' 



Consequently, 

1 /v - 1\ r*'^'"' 

/j^Cv - 2)! I ' / Jih Jo 

which completes the proof of the first part of the lemma. The fact that Lévy processes have stationary and
independent increments together with the last display implies that the sequence $\left(\varepsilon^{\nu,(h)}_{L,n}\right)_n$ is i.i.d. □
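
Two elementary facts used in the proof above lend themselves to an exact check: the volume formula for the ordered simplex, and the vanishing of the alternating binomial sum of the integrals $\int_{n+ih}^{n+(i+1)h}(s - t)^{\nu-2}\,ds$. The sketch below uses exact rational arithmetic. The sign convention $(-1)^{\nu-1-i}$ for the forward difference and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def simplex_volume(n, length):
    """Iterated integral of 1 over {0 <= t_n <= ... <= t_1 <= length}.

    Polynomials are stored as coefficient lists. Each pass integrates from 0
    up to the next outer variable, so after n passes the list represents
    x^n / n!, which is then evaluated at x = length.
    """
    poly = [Fraction(1)]
    for _ in range(n):
        poly = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
    return sum(c * Fraction(length) ** k for k, c in enumerate(poly))


def integral_power(lower, upper, t, p):
    """Exact value of int_lower^upper (s - t)^p ds for an integer p >= 0."""
    return ((upper - t) ** (p + 1) - (lower - t) ** (p + 1)) / (p + 1)


def alternating_sum(nu, h, n, t):
    """sum_i (-1)^(nu-1-i) C(nu-1, i) int_{n+ih}^{n+(i+1)h} (s - t)^(nu-2) ds."""
    return sum(
        (-1) ** (nu - 1 - i) * comb(nu - 1, i)
        * integral_power(n + i * h, n + (i + 1) * h, t, nu - 2)
        for i in range(nu)
    )


h, n, t = Fraction(1, 10), Fraction(3), Fraction(7, 5)
for nu in range(2, 8):
    print(nu, simplex_volume(nu - 2, Fraction(2)), alternating_sum(nu, h, n, t))
```

Each integral is a polynomial of degree $\nu - 2$ in the lower limit, and a forward difference of order $\nu - 1$ annihilates such polynomials, so the second column of the output is exactly zero for every $\nu$.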

Proof of Lemma 5.2. Let $\epsilon > 0$ be given. By the right-continuity of the process $L$ there exists $\delta_{\epsilon,n} > 0$ such that
$\|L(n + t) - L(n)\| \le \epsilon$ for all $t \in [0, \delta_{\epsilon,n}]$. Hence, assuming $\nu h \le \delta_{\epsilon,n}$, Eq. (A.6) implies that



This proves that $\left\|\varepsilon^{\nu,(h)}_{L,n}\right\| \to 0$ as $h \to 0$. We now turn to the absolute moments of $\varepsilon^{\nu,(h)}_{L,n}$. Again it entails no
loss of generality to assume that $n = 0$. Equation (A.6) and the triangle inequality lead to



v,(h) 



V,(/J) 



1 



/iV-2)! 



•f) 11^.(011 dfds 



An application of Hölder's inequality with the dual exponent $k'$ determined by $1/k + 1/k' = 1$ shows that
the last line of the previous display is dominated by

(§( ; )X X"-"-H (XX 



1 



h^{v-2)\ 



ii^^(Oirdfds 



c. 



E {vh-t)\\L(t)\tdt 
where the constant $C$ depends only on $\nu$ and $k$ and is given by

v-I 



C 



1 



(v-2)! 



1 yl" 1] r(,-+i);c'(v-2H2_ .i'(v-2H2l 



Proposition 2.3 asserts the existence of a constant C such that E||L(f)||* < C'f*^^*^° for all t < vh. Conse- 
quently 



v,(h) ' 



Jo {kl{k\ + \]{kl{k\ + 2] 



showing that E ^ || = (9(/!'^^***°*) and thereby completing the proof of the lemma. □ 

We now turn our attention to the approximation of integrals. The following result provides a quantitative
bound for the accuracy with which the trapezoidal rule approximates a definite integral if the integrand is a
smooth function.

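Proposition A.6. Let $[a, b] \subset \mathbb{R}$ be an interval and let $K$ be a positive integer.
i) Assume that $f : [a, b] \to \mathbb{R}$ is a twice differentiable function. Then

\[ \left| \int_a^b f(s)\,ds - T^K_{[a,b]} f \right| \le \frac{(b-a)^3}{12K^2} \sup_{t\in[a,b]} |f''(t)|. \tag{A.7} \]

ii) Assume that $F : [a, b] \to \mathbb{R}^d$ is a twice differentiable function. Then

\[ \left\| \int_a^b F(s)\,ds - T^K_{[a,b]} F \right\| \le \frac{(b-a)^3 \sqrt{d}}{12K^2} \sup_{t\in[a,b]} \|F''(t)\|. \tag{A.8} \]

iii) Assume that $F : [a, b] \to M_d(\mathbb{R})$ is a twice differentiable function. Then

\[ \left\| \int_a^b F(s)\,ds - T^K_{[a,b]} F \right\| \le \frac{(b-a)^3 d^{3/2}}{12K^2} \sup_{t\in[a,b]} \|F''(t)\|. \tag{A.9} \]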
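
The scalar and vector-valued bounds can be illustrated numerically. The sketch below assumes that $T^K_{[a,b]}$ denotes the composite trapezoidal rule with $K$ equal subintervals and that the constant in the bound is $(b-a)^3/(12K^2)$. The test functions and helper names are illustrative choices, not taken from the paper.

```python
import math


def trapezoid(f, a, b, K):
    """Composite trapezoidal rule with K equal subintervals of [a, b]."""
    h = (b - a) / K
    inner = sum(f(a + j * h) for j in range(1, K))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))


def scalar_bound(a, b, K, sup_f2):
    """Right-hand side (b - a)^3 / (12 K^2) * sup |f''| of the scalar bound."""
    return (b - a) ** 3 / (12 * K ** 2) * sup_f2


def scalar_error(K):
    """Error for f = sin on [0, pi]; exact integral 2 and sup |f''| = 1."""
    return abs(trapezoid(math.sin, 0.0, math.pi, K) - 2.0)


def vector_error(K):
    """Euclidean norm of the error for F(s) = (sin s, cos s, exp(-s)) on [0, 1]."""
    comps = [math.sin, math.cos, lambda s: math.exp(-s)]
    exact = [1.0 - math.cos(1.0), math.sin(1.0), 1.0 - math.exp(-1.0)]
    errs = [trapezoid(f, 0.0, 1.0, K) - e for f, e in zip(comps, exact)]
    return math.sqrt(sum(e * e for e in errs))


def vector_bound(K):
    """sqrt(d) (b - a)^3 / (12 K^2) sup ||F''|| with d = 3, sup ||F''|| = sqrt(2)."""
    return math.sqrt(3.0) * scalar_bound(0.0, 1.0, K, math.sqrt(2.0))


for K in (1, 2, 4, 8, 16):
    print(K, scalar_error(K), scalar_bound(0.0, math.pi, K, 1.0),
          vector_error(K), vector_bound(K))
```

For the vector example, $\|F''(s)\|^2 = 1 + e^{-2s}$, so the supremum over $[0, 1]$ is $\sqrt{2}$, attained at $s = 0$.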



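Proof. Part i) is [13, Lemma 9.8]. To see that ii) holds it is enough to apply i) componentwise to obtain
that

\[ \left\| \int_a^b F(s)\,ds - T^K_{[a,b]} F \right\| \le \sqrt{d} \max_{1\le i\le d} \left| \int_a^b F_i(s)\,ds - T^K_{[a,b]} F_i \right| \le \frac{(b-a)^3 \sqrt{d}}{12K^2} \max_{1\le i\le d} \sup_{t\in[a,b]} |F_i''(t)| \le \frac{(b-a)^3 \sqrt{d}}{12K^2} \sup_{t\in[a,b]} \|F''(t)\|. \]

The claim (A.9) about matrix-valued integrands follows from the fact that $M_d(\mathbb{R})$ is canonically isomorphic
to $\mathbb{R}^{d^2}$ and that the operator norm and the Euclidean vector norm induced by this isomorphism satisfy ([41])

\[ \frac{1}{\sqrt{d}} \|M\|_{\ell^2} \le \|M\| \le \|M\|_{\ell^2}, \quad \text{for all } M \in M_d(\mathbb{R}). \] □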
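
The comparison between the operator norm and the Euclidean norm of the vectorized matrix can be checked directly in the smallest nontrivial case. The sketch below assumes $d = 2$, where the largest singular value has a closed form in terms of the trace and determinant of $M^T M$. The function names are illustrative.

```python
import math
import random


def frobenius_norm(m):
    """Euclidean norm of the matrix entries (norm of the vectorized matrix)."""
    return math.sqrt(sum(x * x for row in m for x in row))


def operator_norm_2x2(m):
    """Largest singular value of a 2x2 matrix via the eigenvalues of M^T M."""
    (p, q), (r, s) = m
    trace = p * p + q * q + r * r + s * s        # tr(M^T M)
    det = (p * s - q * r) ** 2                   # det(M^T M) = det(M)^2
    disc = max(trace * trace - 4.0 * det, 0.0)   # guard against rounding
    return math.sqrt(0.5 * (trace + math.sqrt(disc)))


def check_norm_inequality(trials=1000, seed=0, tol=1e-12):
    """Check ||M||_F / sqrt(d) <= ||M|| <= ||M||_F on random 2x2 matrices."""
    rng = random.Random(seed)
    d = 2
    for _ in range(trials):
        m = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)]
        fro, op = frobenius_norm(m), operator_norm_2x2(m)
        if not (fro / math.sqrt(d) <= op + tol and op <= fro + tol):
            return False
    return True


print(check_norm_inequality())
```

The identity matrix attains equality in the lower bound and any rank-one matrix attains equality in the upper bound, so neither constant can be improved.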

A.3. Auxiliary results for Section 6. The following lemmas are used in the proof of the Central Limit
Theorem 6.2.

Lemma A.7. For sequences $(Y_n)_{n\ge 1}$, $(Z_n)_{n\ge 1}$ of vector- or matrix-valued random variables the following
hold:

i) For every constant $c$, $Y_n \xrightarrow{p} c$ if and only if $Y_n \xrightarrow{d} c$.

ii) If $Y_n \xrightarrow{d} Y_\infty$ and $Z_n - Y_n \xrightarrow{p} 0$ then $Z_n \xrightarrow{d} Y_\infty$.

iii) Denote by $\operatorname{supp} Y_n$ the support of $Y_n$. If $Y_n \xrightarrow{d} Y_\infty$ and the function $f$ is defined on $\bigcup_{n\ge 1} \operatorname{supp} Y_n$ and
continuous on an open set containing $\operatorname{supp} Y_\infty$ then $f(Y_n) \xrightarrow{d} f(Y_\infty)$.

Proof. Parts i) and ii) are proved in [44, Theorem 2.7]. Assertion iii) is [23, Theorem 13.25]. □

The next result we will need is a uniform version of the weak law of large numbers, given by [30,
Lemma 2.4].

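Lemma A.8. Assume that for every $\vartheta \in \Theta$, $\Theta$ a compact subset of $\mathbb{R}^r$, there is a sequence $(Y_n(\vartheta))_{n\ge 1}$ of
independent identically distributed random variables with finite expectation $y(\vartheta) = \mathbb{E}\,Y_1(\vartheta) < \infty$. Further
assume that for each $\vartheta' \in \Theta$ the random function $\vartheta \mapsto Y_1(\vartheta)$ is almost surely continuous at $\vartheta'$ and that there
exists a random variable $Z$ satisfying $\mathbb{E}Z < \infty$ such that $\sup_{\vartheta\in\Theta} \|Y_1(\vartheta)\| \le Z$. It then holds that $\vartheta \mapsto y(\vartheta)$
is a continuous function and the time averages $\bar{Y}_N(\vartheta) = \frac{1}{N} \sum_{n=1}^N Y_n(\vartheta)$ converge uniformly in probability to
$y(\vartheta)$, that is $\sup_{\vartheta\in\Theta} \|\bar{Y}_N(\vartheta) - y(\vartheta)\| \xrightarrow{p} 0$.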
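
A small simulation illustrates the uniform law of large numbers. This is a sketch under assumed choices that are not in the paper: $Y_n(\vartheta) = \cos(\vartheta X_n)$ with $X_n$ i.i.d. standard normal, so that $\mathbb{E}\,Y_1(\vartheta) = e^{-\vartheta^2/2}$, the dominating variable is $Z = 1$, and $\Theta = [-2, 2]$. The supremum over $\Theta$ is approximated on a finite grid.

```python
import math
import random


def sup_deviation(N, grid_size=21, seed=0):
    """Largest deviation of the sample mean from the expectation over a grid.

    Assumed toy model: Y_n(theta) = cos(theta * X_n), X_n i.i.d. N(0, 1),
    with expectation exp(-theta^2 / 2) and theta ranging over [-2, 2].
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(N)]
    thetas = [-2.0 + 4.0 * j / (grid_size - 1) for j in range(grid_size)]
    worst = 0.0
    for th in thetas:
        avg = sum(math.cos(th * x) for x in xs) / N
        worst = max(worst, abs(avg - math.exp(-0.5 * th * th)))
    return worst


for N in (100, 1000, 20000):
    print(N, sup_deviation(N))
```

The same sample is reused for every $\vartheta$, which mirrors the setting of the lemma: the convergence is uniform over the parameter for one underlying sequence of observations.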

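Lemma A.9. For each $\vartheta \in \Theta$, let $(Y_n(\vartheta))_{n\ge 1}$ be a sequence of random variables. If $Y_n(\vartheta) \xrightarrow{p} Y_\infty(\vartheta)$
uniformly in $\vartheta$, the sequence $(\vartheta_n)_{n\ge 1}$ of random elements of $\Theta$ converges in probability to some $\vartheta_\infty$ and the
mapping $\vartheta \mapsto Y_\infty(\vartheta)$ is almost surely continuous at $\vartheta_\infty$, then $Y_n(\vartheta_n) \xrightarrow{p} Y_\infty(\vartheta_\infty)$.

Proof. For any $\epsilon > 0$ it holds that

\[ \mathbb{P}\left( \|Y_n(\vartheta_n) - Y_\infty(\vartheta_\infty)\| < \epsilon \right) \ge \mathbb{P}\left( \|Y_n(\vartheta_n) - Y_\infty(\vartheta_n)\| < \tfrac{\epsilon}{2} \text{ and } \|Y_\infty(\vartheta_n) - Y_\infty(\vartheta_\infty)\| < \tfrac{\epsilon}{2} \right) \]
\[ \ge \mathbb{P}\left( \|Y_n(\vartheta_n) - Y_\infty(\vartheta_n)\| < \tfrac{\epsilon}{2} \right) + \mathbb{P}\left( \|Y_\infty(\vartheta_n) - Y_\infty(\vartheta_\infty)\| < \tfrac{\epsilon}{2} \right) - 1 \to 1. \]

The first probability in the last line converges to one as $n \to \infty$ by the assumption of uniform convergence
of $Y_n$ to $Y_\infty$, the second because $Y_\infty$ is almost surely continuous at $\vartheta_\infty$ and $\vartheta_n \xrightarrow{p} \vartheta_\infty$. □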
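
The plug-in convergence of Lemma A.9 can be illustrated with a toy simulation. The model below is an assumption made for illustration only: $Y_n(\vartheta) = \frac{1}{n}\sum_{i=1}^n \cos(\vartheta X_i)$ with $X_i$ i.i.d. standard normal, limit $Y_\infty(\vartheta) = e^{-\vartheta^2/2}$, random parameter $\vartheta_n = 1 + \frac{1}{n}\sum_{i=1}^n X_i$ and limit $\vartheta_\infty = 1$.

```python
import math
import random


def plug_in_error(n, seed=0):
    """|Y_n(theta_n) - Y_inf(theta_inf)| in the assumed toy model.

    theta_n = 1 + sample mean of the X_i converges in probability to 1, and
    Y_n converges uniformly on compact sets to theta -> exp(-theta^2 / 2).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta_n = 1.0 + sum(xs) / n                       # random parameter estimate
    y_n = sum(math.cos(theta_n * x) for x in xs) / n  # Y_n evaluated at theta_n
    return abs(y_n - math.exp(-0.5))                  # compare with Y_inf(1)


for n in (100, 1000, 20000):
    print(n, plug_in_error(n))
```

The error splits exactly as in the proof: one part comes from replacing $Y_n$ by $Y_\infty$ at the random point $\vartheta_n$, the other from the continuity of $Y_\infty$ at $\vartheta_\infty$.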